sidebar_position | sidebar_label |
---|---|
1 | Index |
The Open Catalog is a collaborative effort to consolidate expert knowledge on code guidelines for the correctness, modernization, and optimization of code written in the C, C++, and Fortran programming languages. The Catalog consists of a comprehensive set of checks (rules). Each check describes a specific issue in the source code and provides guidance on how to correct it, along with extensive documentation, example code, and references for further reading.
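To make the idea of a check concrete, here is a minimal sketch of the kind of before/after pair that a rule such as PWR010 (avoid column-major array access in C/C++) is about. It is an illustration written for this page, not the catalog's own example code; the function names `sum_column_order` and `sum_row_order` are hypothetical, and the matrix is assumed to be stored contiguously in row-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the inner loop walks down a column, so consecutive iterations
   touch elements that are `cols` doubles apart in memory. */
double sum_column_order(const double *a, size_t rows, size_t cols) {
  assert(a != NULL);
  double sum = 0.0;
  for (size_t j = 0; j < cols; ++j) {
    for (size_t i = 0; i < rows; ++i) {
      sum += a[i * cols + j];
    }
  }
  return sum;
}

/* After: the loops are interchanged, so the inner loop walks along a row
   and consecutive iterations touch adjacent elements. */
double sum_row_order(const double *a, size_t rows, size_t cols) {
  assert(a != NULL);
  double sum = 0.0;
  for (size_t i = 0; i < rows; ++i) {
    for (size_t j = 0; j < cols; ++j) {
      sum += a[i * cols + j];
    }
  }
  return sum;
}
```

Both functions visit the same elements. Only the traversal order changes, which is why a recommendation of this kind is usually about memory access efficiency rather than about the result. For general floating-point data the two sums may differ in the last bits because the additions happen in a different order.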
The Open Catalog includes a suite of microbenchmarks designed to demonstrate:
- No performance degradation when implementing the correctness and modernization recommendations.
- Potential performance enhancements achievable through the optimization recommendations.
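As a rough idea of what such a microbenchmark could look like, the sketch below times two variants of the same kernel, loosely modeled on PWR045 (replace division with a multiplication with a reciprocal). It is not taken from the catalog's benchmark suite: `scale_by_division`, `scale_by_reciprocal`, `kernel_fn`, and `time_kernel` are hypothetical names, and timing with `clock()` is an assumption made to keep the example dependent only on the C standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Baseline kernel: one division per element. */
void scale_by_division(double *out, const double *in, size_t n, double d) {
  for (size_t i = 0; i < n; ++i) {
    out[i] = in[i] / d;
  }
}

/* Candidate kernel: compute the reciprocal once and multiply.
   Results can differ from the baseline in the last bits when 1.0 / d
   is not exactly representable. */
void scale_by_reciprocal(double *out, const double *in, size_t n, double d) {
  const double r = 1.0 / d;
  for (size_t i = 0; i < n; ++i) {
    out[i] = in[i] * r;
  }
}

typedef void (*kernel_fn)(double *, const double *, size_t, double);

/* Returns the CPU time, in seconds, spent running `kernel` `reps` times. */
double time_kernel(kernel_fn kernel, double *out, const double *in, size_t n,
                   double d, int reps) {
  assert(reps > 0);
  const clock_t start = clock();
  for (int r = 0; r < reps; ++r) {
    kernel(out, in, n, d);
  }
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A driver would fill `in`, call `time_kernel` once per variant with a large `n` and enough repetitions to get a measurable time, and compare both the timings and the outputs. An optimizing compiler may transform or eliminate such small kernels, so a real benchmark has to check that the measured code is the code it intends to measure.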
ID | Title | Category | C | Fortran | C++ | AutoFix |
---|---|---|---|---|---|---|
PWR001 | Declare global variables as function parameters | correctness | ✓ | ✓ | ✓ | |
PWR002 | Declare scalar variables in the smallest possible scope | correctness | ✓ | | ✓ | |
PWR003 | Explicitly declare pure functions | modernization | ✓ | ✓ | ✓ | |
PWR004 | Declare OpenMP scoping for all variables | correctness | ✓ | ✓ | ✓ | |
PWR005 | Disable default OpenMP scoping | correctness | ✓ | ✓ | ✓ | |
PWR006 | Avoid privatization of read-only variables | optimization | ✓ | ✓ | ✓ | |
PWR007 | Disable implicit declaration of variables | correctness | | ✓ | | ✓1 |
PWR008 | Declare the intent for each procedure parameter | correctness | | ✓ | | ✓1 |
PWR009 | Use OpenMP teams to offload work to GPU | optimization | ✓ | ✓ | ✓ | |
PWR010 | Avoid column-major array access in C/C++ | optimization | ✓ | | ✓ | |
PWR012 | Pass only required fields from derived type as parameters | optimization | ✓ | ✓ | ✓ | |
PWR013 | Avoid copying unused variables to or from the GPU | optimization | ✓ | ✓ | ✓ | |
PWR014 | Out-of-dimension-bounds matrix access | correctness | ✓ | | ✓ | |
PWR015 | Avoid copying unnecessary array elements to or from the GPU | optimization | ✓ | ✓ | ✓ | |
PWR016 | Use separate arrays instead of an Array-of-Structs | optimization | ✓ | ✓ | ✓ | |
PWR017 | Using countable while loops instead of for loops may inhibit vectorization | optimization | ✓ | | ✓ | |
PWR018 | Call to recursive function within a loop inhibits vectorization | optimization | ✓ | ✓ | ✓ | |
PWR019 | Consider interchanging loops to favor vectorization by maximizing inner loop's trip count | optimization | ✓ | ✓ | ✓ | |
PWR020 | Consider loop fission to enable vectorization | optimization | ✓ | ✓ | ✓ | |
PWR021 | Consider loop fission with scalar to vector promotion to enable vectorization | optimization | ✓ | ✓ | ✓ | |
PWR022 | Move invariant conditional out of the loop to facilitate vectorization | optimization | ✓ | ✓ | ✓ | |
PWR023 | Add 'restrict' for pointer function parameters to hint the compiler that vectorization is safe | optimization | ✓ | | ✓ | |
PWR024 | Loop can be rewritten in OpenMP canonical form | optimization | ✓ | | ✓ | |
PWR025 | Consider annotating pure function with OpenMP 'declare simd' | optimization | ✓ | ✓ | ✓ | |
PWR026 | Annotate function for OpenMP Offload | optimization | ✓ | ✓ | ✓ | |
PWR027 | Annotate function for OpenACC Offload | optimization | ✓ | ✓ | ✓ | |
PWR028 | Remove pointer increment preventing performance optimization | optimization | ✓ | | ✓ | |
PWR029 | Remove integer increment preventing performance optimization | optimization | ✓ | ✓ | ✓ | |
PWR030 | Remove pointer assignment preventing performance optimization for perfectly nested loops | optimization | ✓ | ✓ | ✓ | |
PWR031 | Replace pow by multiplication, division and/or square root | optimization | ✓ | ✓ | ✓ | |
PWR032 | Avoid calls to mathematical functions with higher precision than required | optimization | ✓ | | ✓ | |
PWR033 | Move invariant conditional out of the loop to avoid redundant computations | optimization | ✓ | ✓ | ✓ | |
PWR034 | Avoid strided array access to improve performance | optimization | ✓ | ✓ | ✓ | |
PWR035 | Avoid non-consecutive array access to improve performance | optimization | ✓ | ✓ | ✓ | |
PWR036 | Avoid indirect array access to improve performance | optimization | ✓ | ✓ | ✓ | |
PWR037 | Potential precision loss in call to mathematical function | correctness | ✓ | | ✓ | |
PWR039 | Consider loop interchange to improve the locality of reference and enable vectorization | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR040 | Consider loop tiling to improve the locality of reference | optimization | ✓ | ✓ | ✓ | |
PWR042 | Consider loop interchange by promoting the scalar reduction variable to an array | optimization | ✓ | ✓ | ✓ | |
PWR043 | Consider loop interchange by replacing the scalar reduction value | optimization | ✓ | ✓ | ✓ | |
PWR044 | Avoid unnecessary floating-point data conversions involving constants | optimization | ✓ | | ✓ | |
PWR045 | Replace division with a multiplication with a reciprocal | optimization | ✓ | | ✓ | |
PWR046 | Replace two divisions with a division and a multiplication | optimization | ✓ | ✓ | ✓ | |
PWR048 | Replace multiplication/addition combo with an explicit call to fused multiply-add | optimization | ✓ | | ✓ | |
PWR049 | Move iterator-dependent condition outside of the loop | optimization | ✓ | ✓ | ✓ | |
PWR050 | Consider applying multithreading parallelism to forall loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR051 | Consider applying multithreading parallelism to scalar reduction loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR052 | Consider applying multithreading parallelism to sparse reduction loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR053 | Consider applying vectorization to forall loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR054 | Consider applying vectorization to scalar reduction loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR055 | Consider applying offloading parallelism to forall loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR056 | Consider applying offloading parallelism to scalar reduction loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR057 | Consider applying offloading parallelism to sparse reduction loop | optimization | ✓ | ✓ | ✓ | ✓1 |
PWR060 | Consider loop fission to separate gather memory access pattern | optimization | ✓ | ✓ | ✓ | |
PWR062 | Consider loop interchange by removing accumulation on array value | optimization | ✓ | ✓ | ✓ | |
PWR063 | Avoid using legacy Fortran constructs | modernization | | ✓ | | |
PWR068 | Encapsulate procedures within modules to avoid the risks of calling implicit interfaces | correctness | | ✓ | | |
PWR069 | Use the keyword only to explicitly state what to import from a module | correctness | | ✓ | | ✓1 |
PWR070 | Declare array dummy arguments as assumed-shape arrays | correctness | | ✓ | | |
PWR071 | Prefer real(kind=kind_value) for declaring consistent floating types | modernization | | ✓ | | |
PWR072 | Split the variable initialization from the declaration to prevent the implicit 'save' behavior | correctness | | ✓ | | ✓1 |
PWR073 | Transform common block into a module for better data encapsulation | modernization | | ✓ | | |
PWR075 | Avoid using GNU Fortran extensions | modernization | | ✓ | | |
PWR079 | Avoid undefined behavior due to uninitialized variables | correctness | ✓ | ✓ | ✓ | |
PWD002 | Unprotected multithreading reduction operation | correctness | ✓ | ✓ | ✓ | |
PWD003 | Missing array range in data copy to the GPU | correctness | ✓ | ✓ | ✓ | |
PWD004 | Out-of-memory-bounds array access | correctness | ✓ | ✓ | ✓ | |
PWD005 | Array range copied to or from the GPU does not cover the used range | correctness | ✓ | ✓ | ✓ | |
PWD006 | Missing deep copy of non-contiguous data to the GPU | correctness | ✓ | ✓ | ✓ | |
PWD007 | Unprotected multithreading recurrence | correctness | ✓ | ✓ | ✓ | |
PWD008 | Unprotected multithreading recurrence due to out-of-dimension-bounds array access | correctness | ✓ | ✓ | ✓ | |
PWD009 | Incorrect privatization in parallel region | correctness | ✓ | ✓ | ✓ | |
PWD010 | Incorrect sharing in parallel region | correctness | ✓ | ✓ | ✓ | |
PWD011 | Missing OpenMP lastprivate clause | correctness | ✓ | ✓ | ✓ | |
RMK001 | Loop nesting that might benefit from hybrid parallelization using multithreading and SIMD | optimization | ✓ | ✓ | ✓ | |
RMK002 | Loop nesting that might benefit from hybrid parallelization using offloading and SIMD | optimization | ✓ | ✓ | ✓ | |
RMK003 | Potentially privatizable temporary variable | optimization | ✓ | | ✓ | |
RMK007 | Vectorization opportunity within a multithreaded region | optimization | ✓ | ✓ | ✓ | |
RMK008 | Vectorization opportunity within an offloaded region | optimization | ✓ | ✓ | ✓ | |
RMK009 | Outline loop to increase compiler and tooling code coverage | optimization | ✓ | | ✓ | |
RMK010 | Strided memory accesses in the loop body may prevent vectorization | optimization | ✓ | ✓ | ✓ | |
RMK012 | Conditional execution in the loop body may prevent vectorization | optimization | ✓ | ✓ | ✓ | |
RMK013 | Low trip count unknown at compile time may prevent vectorization of the loop | optimization | ✓ | ✓ | ✓ | |
RMK014 | Unpredictable memory accesses in the loop body may prevent vectorization | optimization | ✓ | ✓ | ✓ | |
RMK015 | Tune compiler optimization flags to increase the speed of the code | optimization | ✓ | ✓ | ✓ | |
RMK016 | Tune compiler optimization flags to avoid potential changes in floating point precision | correctness | ✓ | ✓ | ✓ | |
AutoFix: Denotes tools that support automatic correction of the corresponding check. Readers are encouraged to report additional tools with autofix capabilities for these checks. The tools are tagged in the table as follows:
We welcome and encourage contributions to the Open Catalog! Here's how you can get involved:
- Join the discussion: Got ideas, questions, or suggestions? Head over to our GitHub Discussions. It's the perfect place for open-ended conversations and brainstorming!
- Report issues: Found inaccuracies, unclear explanations, or other problems? Please open an Issue. Detailed reports help us quickly improve the quality of the project!
- Submit pull requests: Interested in solving any issues? Feel free to fork the repository, make your changes, and submit a Pull Request. We'd love to see your contributions!