[DRAFT] Permit INV_TRANS with OpenMP offload #196

Draft
wants to merge 12 commits into
base: develop

samhatfield
Collaborator

Now that we can compile with OpenMP offload, I will gradually add back this capability to ecTrans until we can run the benchmark program.

The purpose of this PR is to allow, at least in a minimal fashion, OpenMP offload to be used with the inverse transform. This PR can be merged once this program produces the correct behaviour:

PROGRAM TEST_PROGRAM

USE PARKIND1, ONLY: JPIM, JPRB

IMPLICIT NONE

! Spectral truncation
INTEGER(JPIM) :: TRUNC = 79

! Arrays for storing our field in spectral space and grid point space
REAL(KIND=JPRB), ALLOCATABLE :: SPECTRAL_FIELD(:,:)
REAL(KIND=JPRB), ALLOCATABLE :: GRID_POINT_FIELD(:,:,:)

! Dimensions of our arrays in spectral space and grid point space
INTEGER(KIND=JPIM) :: NSPEC2
INTEGER(KIND=JPIM) :: NGPTOT

#include "setup_trans0.h"
#include "setup_trans.h"
#include "trans_inq.h"
#include "inv_trans.h"

!CALL DR_HOOK_INIT()

! Initialise ecTrans (resolution-agnostic aspects)
CALL SETUP_TRANS0(LDMPOFF=.TRUE., KPRINTLEV=2)

! Initialise ecTrans (resolution-specific aspects)
CALL SETUP_TRANS(KSMAX=TRUNC, KDGL=2 * (TRUNC + 1))

! Inquire about the dimensions in spectral space and grid point space
CALL TRANS_INQ(KSPEC2=NSPEC2, KGPTOT=NGPTOT)

! Allocate our work arrays
ALLOCATE(SPECTRAL_FIELD(1,NSPEC2))
ALLOCATE(GRID_POINT_FIELD(NGPTOT,1,1))

! Initialise our spectral field array
SPECTRAL_FIELD(:,:) = 0.0_JPRB

! Perform an inverse transform
CALL INV_TRANS(PSPSCALAR=SPECTRAL_FIELD, PGP=GRID_POINT_FIELD)

END PROGRAM TEST_PROGRAM

This just performs an inverse transform on a single 2D scalar field of zeroes.

@samhatfield added the enhancement and gpu labels on Jan 23, 2025
@samhatfield force-pushed the refresh_openmp_inv_trans branch from c14da12 to 0a390d0 on January 23, 2025 14:58
@samhatfield
Collaborator Author

Current state of play: the test program crashes in LEINV on the first instance of

!$OMP TARGET DATA USE_DEVICE_PTR(ZAA,ZINP,ZOUTA)

with error

Memory access fault by GPU node-4 (Agent handle: 0xf82500) on address 0xe2b000. Reason: Unknown.

The similar statement above for the 0-mode double-precision arrays (ZAA0 etc.) does not produce this error.

@samhatfield
Collaborator Author

After further investigation, it seems that ZAA etc. are correctly transferred to the device, which is good news. The issue is that the first call to a routine in HICBLAS_MOD (HIP_DGEMM_BATCHED_OVERLOAD) somehow leaves the device in a corrupted state. Accesses to ZAA before this call succeed (the sum of the array computed on the device matches the sum computed on the host), but the exact same computation after the call triggers the "Memory access fault" error quoted above.
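For reference, the kind of device-versus-host sum check described above can be sketched as follows. This is a minimal standalone illustration, not the code used in LEINV: the program name, the array size N, the ZSUM_* variables and the contents of ZAA are all assumptions made for the example, and the HICBLAS_MOD call is represented only by a placeholder comment.

```fortran
PROGRAM CHECK_DEVICE_SUM

USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: REAL64

IMPLICIT NONE

! Hypothetical stand-in for the real ZAA array (size and values are assumptions)
INTEGER, PARAMETER :: N = 1024
REAL(KIND=REAL64), ALLOCATABLE :: ZAA(:)
REAL(KIND=REAL64) :: ZSUM_HOST, ZSUM_BEFORE, ZSUM_AFTER
INTEGER :: I

ALLOCATE(ZAA(N))
DO I = 1, N
  ZAA(I) = REAL(I, KIND=REAL64)
END DO

! Reference sum computed on the host
ZSUM_HOST = SUM(ZAA)
ZSUM_BEFORE = 0.0_REAL64
ZSUM_AFTER = 0.0_REAL64

! Keep ZAA resident on the device for the duration of both checks
!$OMP TARGET DATA MAP(TO: ZAA)

! Sum on the device before the suspect call
!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO MAP(TOFROM: ZSUM_BEFORE) REDUCTION(+: ZSUM_BEFORE)
DO I = 1, N
  ZSUM_BEFORE = ZSUM_BEFORE + ZAA(I)
END DO
!$OMP END TARGET TEAMS DISTRIBUTE PARALLEL DO

! Placeholder: the suspect library call would go here

! Identical sum on the device after the suspect call
!$OMP TARGET TEAMS DISTRIBUTE PARALLEL DO MAP(TOFROM: ZSUM_AFTER) REDUCTION(+: ZSUM_AFTER)
DO I = 1, N
  ZSUM_AFTER = ZSUM_AFTER + ZAA(I)
END DO
!$OMP END TARGET TEAMS DISTRIBUTE PARALLEL DO

!$OMP END TARGET DATA

PRINT *, "Host sum:           ", ZSUM_HOST
PRINT *, "Device sum (before):", ZSUM_BEFORE
PRINT *, "Device sum (after): ", ZSUM_AFTER

! The values are small integers, so the sums are exact in any summation order
IF (ZSUM_BEFORE /= ZSUM_HOST .OR. ZSUM_AFTER /= ZSUM_HOST) THEN
  ERROR STOP "Device sum does not match host sum"
END IF

END PROGRAM CHECK_DEVICE_SUM
```

Compiled without OpenMP offload, the directives are treated as comments and all three sums are computed on the host, which gives a quick sanity check of the logic before running on a GPU. With real floating-point data the comparison would need a tolerance, since the device reduction may sum in a different order.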
