Parallel computing with OpenCL on GPUs and CPUs.


■ 1. LUX.GPU.OpenCL Library

TOpenCL :Singleton of TCLSystem
TCLSystem :System
 ┗TCLPlatfos :Platform list
   ┗TCLPlatfo :Platform
     ┣TCLExtenss :Extension list
     ┣TCLDevices :Device list
     ┃ ┗TCLDevice :Device
     ┗TCLContexs :Context list
       ┗TCLContex :Context
         ┣TCLQueuers :Command Queue list
         ┃ ┗TCLQueuer :Command Queue
         ┣TCLArgumes :Argument list
         ┃ ┣TCLBuffer :Buffer
         ┃ ┣TCLImager :Image
         ┃ ┗TCLSamplr :Sampler
         ┣TCLLibrars :Library list
         ┃ ┗TCLLibrar :Library
         ┗TCLExecuts :Executable program list
           ┗TCLExecut :Executable program
             ┣TCLBuildrs :Build list
             ┃ ┗TCLBuildr :Build
             ┗TCLKernels :Kernel list
               ┗TCLKernel :Kernel
                 ┗TCLParames :Parameter list
                   ┗TCLParame :Parameter

■ 2. Usage

The TOpenCL class is a singleton of the TCLSystem class. The TCLSystem class automatically detects all computing devices present on the host machine.

⬤ 2.1. Platform

The "platform" object (TCLPlatfo) represents the environment provided by each device vendor. The TCLSystem class automatically detects all platforms and lists them in the Platfos[] property.

Object Pascal

TOpenCL.Platfos.Count :Integer    // Number of all platforms
TOpenCL.Platfos[*]    :TCLPlatfo  // Array of all platforms

The TCLPlatfo class provides information about a specific platform as properties.

Object Pascal

_Platfo := TOpenCL.Platfos[0];  // Selecting a specific platform

_Platfo.Handle        :T_cl_platform_id  // Pointer
_Platfo.Profile       :String            // Profile
_Platfo.Version       :String            // Version
_Platfo.Name          :String            // Name
_Platfo.Vendor        :String            // Vendor Name
_Platfo.Extenss.Count :Integer           // Number of Extensions
_Platfo.Extenss[*]    :String            // Array of Extensions
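
As a minimal sketch, the properties above could be used to print every detected platform together with its extensions. The unit name in the uses clause and the zero-based indexing up to Count - 1 are assumptions; only the properties themselves come from the listings above.

```pascal
program ListPlatforms;

{$APPTYPE CONSOLE}

uses
  LUX.GPU.OpenCL;  // ASSUMPTION: unit name; adjust it to your installation

var
  I, J    :Integer;
  _Platfo :TCLPlatfo;
begin
  // ASSUMPTION: the lists are zero-based and run to Count - 1
  for I := 0 to TOpenCL.Platfos.Count - 1 do
  begin
    _Platfo := TOpenCL.Platfos[ I ];

    Writeln( 'Platform ', I, ': ', _Platfo.Name, ' (', _Platfo.Vendor, ')' );
    Writeln( '  Version: ', _Platfo.Version );
    Writeln( '  Profile: ', _Platfo.Profile );

    for J := 0 to _Platfo.Extenss.Count - 1 do
      Writeln( '  Extension: ', _Platfo.Extenss[ J ] );
  end;
end.
```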

⬤ 2.2. Device

The "device" object (TCLDevice) represents a physical GPU or CPU. The TCLPlatfo class automatically detects all devices that belong to its platform and lists them in the Devices[] property.

Object Pascal

_Platfo.Devices.Count :Integer    // Number of devices
_Platfo.Devices[*]    :TCLDevice  // Array of devices

The TCLDevice class provides detailed information about each specific device through its properties.

Object Pascal

_Device := _Platfo.Devices[0];  // Selecting a specific device

_Device.Handle           :T_cl_device_id    // Pointer
_Device.DEVICE_TYPE      :T_cl_device_type  // Type
_Device.DEVICE_VENDOR_ID :T_cl_uint         // Vendor ID
_Device.DEVICE_NAME      :String            // Name
_Device.DEVICE_VENDOR    :String            // Vendor
_Device.DRIVER_VERSION   :String            // Driver Version
_Device.DEVICE_PROFILE   :String            // Profile
_Device.DEVICE_VERSION   :String            // Version
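
The following sketch walks over all platforms and prints the devices found on each, which can help when choosing the indices for Platfos[] and Devices[]. The unit name and the zero-based loops are assumptions; the properties are the ones listed above.

```pascal
program ListDevices;

{$APPTYPE CONSOLE}

uses
  LUX.GPU.OpenCL;  // ASSUMPTION: unit name; adjust it to your installation

var
  P, D    :Integer;
  _Platfo :TCLPlatfo;
  _Device :TCLDevice;
begin
  for P := 0 to TOpenCL.Platfos.Count - 1 do
  begin
    _Platfo := TOpenCL.Platfos[ P ];
    Writeln( 'Platfos[', P, '] ', _Platfo.Name );

    for D := 0 to _Platfo.Devices.Count - 1 do
    begin
      _Device := _Platfo.Devices[ D ];
      Writeln( '  Devices[', D, '] ', _Device.DEVICE_NAME );
      Writeln( '    Vendor : ', _Device.DEVICE_VENDOR  );
      Writeln( '    Driver : ', _Device.DRIVER_VERSION );
      Writeln( '    Version: ', _Device.DEVICE_VERSION );
    end;
  end;
end.
```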

⬤ 2.3. Context

The "context" object (TCLContex) manages a collection of related data and programs. A TCLContex object is created with a TCLPlatfo object as its argument.

Object Pascal

_Contex := TCLContex.Create( _Platfo );

The created TCLContex object is registered in the Contexs[] property of the TCLPlatfo object.

Object Pascal

_Platfo.Contexs.Count :Integer    // Number of contexts
_Platfo.Contexs[*]    :TCLContex  // Array  of contexts

⬤ 2.4. Command Queue

The "command queue" object (TCLQueuer) manages the commands sent to the device. In other words, it serves as a bridge between a context and a device. A TCLQueuer object is created with a TCLContex object and a TCLDevice object as arguments.

Object Pascal

_Queuer := TCLQueuer.Create( _Contex, _Device );
_Queuer := _Contex.Queuers[ _Device ];

The TCLContex class registers the TCLQueuer object in the Queuers[] property.

Object Pascal

_Contex.Queuers.Count :Integer    // Number of command queues
_Contex.Queuers[*]    :TCLQueuer  // Array  of command queues

Note that a command queue cannot be created from a context and a device that belong to different platforms.

Object Pascal

P0 := TOpenCL.Platfos[0];
P1 := TOpenCL.Platfos[1];
P2 := TOpenCL.Platfos[2];

D00 := P0.Devices[0];  D01 := P0.Devices[1];  D02 := P0.Devices[2]; 
D10 := P1.Devices[0];
D20 := P2.Devices[0];

C0 := TCLContex.Create( P0 ); 
C1 := TCLContex.Create( P1 ); 
C2 := TCLContex.Create( P2 );

Q00 := TCLQueuer.Create( C0, D00 );  // OK
Q01 := TCLQueuer.Create( C0, D01 );  // OK
Q02 := TCLQueuer.Create( C0, D02 );  // OK

Q10 := TCLQueuer.Create( C1, D00 );  // Error
Q11 := TCLQueuer.Create( C1, D01 );  // Error
Q12 := TCLQueuer.Create( C1, D02 );  // Error

Q20 := TCLQueuer.Create( C2, D00 );  // Error
Q21 := TCLQueuer.Create( C2, D10 );  // Error
Q22 := TCLQueuer.Create( C2, D20 );  // OK
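
One way to avoid the mismatches marked "Error" above is to take the context and the device from the same TCLPlatfo object by construction. The sketch below does this for the first device of every platform. The unit name is an assumption, and object lifetime is left out because this document does not describe it.

```pascal
program QueuePerPlatform;

{$APPTYPE CONSOLE}

uses
  LUX.GPU.OpenCL;  // ASSUMPTION: unit name; adjust it to your installation

var
  I       :Integer;
  _Platfo :TCLPlatfo;
  _Contex :TCLContex;
  _Queuer :TCLQueuer;
begin
  for I := 0 to TOpenCL.Platfos.Count - 1 do
  begin
    _Platfo := TOpenCL.Platfos[ I ];

    if _Platfo.Devices.Count = 0 then Continue;  // No device to send commands to

    // Both arguments of the queue originate from the same platform
    _Contex := TCLContex.Create( _Platfo );
    _Queuer := TCLQueuer.Create( _Contex, _Platfo.Devices[ 0 ] );

    Writeln( 'Queue created on ', _Platfo.Name,
             ', queues in context: ', _Contex.Queuers.Count );
  end;
end.
```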

⬤ 2.5. Argument

TCLArgume
 ┣TCLMemory
 ┃ ┣TCLBuffer
 ┃ ┗TCLImager
 ┗TCLSamplr

▼ 2.5.1. Memory

The "memory" object (TCLMemory) stores various data and shares it with the device. A memory object is created with a TCLContex object and a TCLQueuer object as arguments. TCLMemory is an abstract class from which the TCLBuffer and TCLImager classes are derived.

▽ Buffer

The TCLBuffer class stores an array of any "simple type" or "record type."

If you want to send an array of the following structure type to the device,

OpenCL C

typedef struct {
  int    A;
  double B;
} TItem;

kernel void Main( global TItem* Buffer ) {
  // ...
}

create a TCLBuffer object as follows.

Object Pascal

TItem = record
  A :Integer;
  B :Double;
  constructor Create( const A_ :Integer; const B_ :Double );  // Used when writing elements
end;

_Buffer := TCLBuffer<TItem>.Create( _Contex, _Queuer );

Read and write array data through the Data property. The array data must be "mapped" to synchronize with the host before reading or writing, and "unmapped" to synchronize with the device after use.

Object Pascal

_Buffer.Count := 3;                          // Setting the number of elements
_Buffer.Data.Map;                            // Synchronize data with host
_Buffer.Data[0] := TItem.Create( 1, 2.34 );  // Writing
_Buffer.Data[1] := TItem.Create( 5, 6.78 );  // Writing
_Buffer.Data[2] := TItem.Create( 9, 0.12 );  // Writing
_Buffer.Data.Unmap;                          // Synchronize data with Device
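
Because the same bytes are read on both sides, the Delphi record needs the same memory layout as the OpenCL C struct. In OpenCL C, int is 4 bytes and double is 8 bytes aligned to 8, so the struct above occupies 16 bytes with B at offset 8. The following sketch checks the host side without using the library at all. The result depends on the compiler's record alignment setting, so no fixed output is claimed here.

```pascal
program CheckItemLayout;

{$APPTYPE CONSOLE}

type
  TItem = record
    A :Integer;  // 4 bytes, corresponds to "int"    in OpenCL C
    B :Double;   // 8 bytes, corresponds to "double" in OpenCL C
  end;

var
  _Item   :TItem;
  OffsetB :NativeUInt;
begin
  // Distance in bytes from the start of the record to field B
  OffsetB := NativeUInt( @_Item.B ) - NativeUInt( @_Item );

  Writeln( 'SizeOf( TItem ) = ', SizeOf( TItem ) );
  Writeln( 'Offset of B     = ', OffsetB );

  if ( SizeOf( TItem ) = 16 ) and ( OffsetB = 8 ) then
    Writeln( 'Layout matches the OpenCL C struct.' )
  else
    Writeln( 'Layout differs: check the record alignment settings.' );
end.
```

If the layouts differ, reordering the fields or adding explicit padding fields on the host side is a common remedy.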

▽ Image

The "image" object (TCLImager) stores a pixel array of one to three dimensions. 3D voxel data is also treated as a 3D image. TCLImager is an abstract class; its derived classes differ in the dimension, the channel order, and the data type of the color channels.

 ┃ ┣TCLImager1DxBGRAxUInt8
 ┃ ┣TCLImager1DxBGRAxUFix8
 ┃ ┣TCLImager1DxRGBAxUInt32
 ┃ ┗TCLImager1DxRGBAxSFlo32
 ┃ ┣TCLImager2DxBGRAxUInt8
 ┃ ┣TCLImager2DxBGRAxUFix8
 ┃ ┣TCLImager2DxRGBAxUInt32
 ┃ ┗TCLImager2DxRGBAxSFlo32

The first part of the class name represents the dimension of an image.

  • TCLImager1Dx*x*
    • Dimension:1D
  • TCLImager2Dx*x*
    • Dimension:2D
  • TCLImager3Dx*x*
    • Dimension:3D

The second part of the class name represents the color channel order of an image.

  • TCLImager*xBGRAx*
    • Color channel order:BGRA
  • TCLImager*xRGBAx*
    • Color channel order:RGBA

The third part of the class name represents the color data type of an image.

  • TCLImager*x*xUInt8
    • Device-side data type:uint8 @ OpenCL C
    • Host-side data type:UInt8 (Byte) @ Delphi
  • TCLImager*x*xUFix8
    • Device-side data type:float @ OpenCL C
    • Host-side data type:UInt8 (Byte) @ Delphi
  • TCLImager*x*xUInt32
    • Device-side data type:uint @ OpenCL C
    • Host-side data type:UInt32 (Cardinal) @ Delphi
  • TCLImager*x*xSFlo32
    • Device-side data type:float @ OpenCL C
    • Host-side data type:Single @ Delphi
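
The UFix8 case pairs a Byte on the host with a float on the device. A plausible reading, which this document does not state explicitly, is that it corresponds to OpenCL's unsigned normalized 8-bit channel type (CL_UNORM_INT8), where a stored value of 0..255 is read by the kernel as 0.0..1.0. The sketch below illustrates that mapping in plain Delphi. The function names are hypothetical and not part of the library.

```pascal
program UFix8Demo;

{$APPTYPE CONSOLE}

// Host byte -> value a kernel would read, ASSUMING unsigned normalized storage
function UFix8ToFloat( const Value_ :Byte ) :Single;
begin
  Result := Value_ / 255;
end;

// Float -> host byte: clamp to 0..1, scale, then round to nearest
// (Round uses round-half-to-even under the default rounding mode)
function FloatToUFix8( const Value_ :Single ) :Byte;
var
  V :Single;
begin
  V := Value_;
  if V < 0 then V := 0;
  if V > 1 then V := 1;
  Result := Round( V * 255 );
end;

begin
  Writeln( 'Byte   0 -> ', UFix8ToFloat(   0 ):0:4 );
  Writeln( 'Byte 128 -> ', UFix8ToFloat( 128 ):0:4 );
  Writeln( 'Byte 255 -> ', UFix8ToFloat( 255 ):0:4 );
  Writeln( 'Float 1.5 (clamped) -> ', FloatToUFix8( 1.5 ) );
end.
```

By contrast, the UInt8 variant would hand the raw integer 0..255 to the kernel without scaling.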

The CountX, CountY, and CountZ properties set the number of pixels in the X, Y, and Z directions.

Object Pascal

_Imager := TCLImager3DxBGRAxUInt8.Create( _Contex, _Queuer );
_Imager.CountX := 100;  // Number of pixels in the X direction
_Imager.CountY := 200;  // Number of pixels in the Y direction
_Imager.CountZ := 300;  // Number of pixels in the Z direction

▼ 2.5.2. Sampler

The "sampler" object (TCLSamplr) defines how pixel colors are interpolated when an image is read at real-valued coordinates.
A TCLSamplr object is created with a TCLContex object as its argument.

Object Pascal

_Samplr := TCLSamplr.Create( _Contex );

⬤ 2.6. Program

The "program" object (TCLProgra) reads source code and compiles it into an executable binary. A program object is created with a TCLContex object as its argument. TCLProgra is an abstract class; depending on the type of source code, use one of its derived classes, TCLLibrar or TCLExecut.

▼ 2.6.1. Library

The TCLLibrar class represents a library-type program, that is, a program that contains no functions to execute directly.

Object Pascal

_Librar := TCLLibrar.Create( _Contex );

_Librar.Source.LoadFromFile( '' );  // Load source code

▼ 2.6.2. Executable

The TCLExecut class represents an executable program, that is, a program that contains functions (kernels) to execute directly.

Object Pascal

_Execut := TCLExecut.Create( _Contex );

_Execut.Source.LoadFromFile( '' );  // Load source code

⬤ 2.7. Build

A "build" (TCLBuildr) is really an action performed on a program for a specific device, but this library represents it explicitly as a class.

Object Pascal

_Buildr := TCLBuildr.Create( _Execut, _Device );
_Buildr := _Execut.Buildrs[ _Device ];
_Buildr := _Execut.BuildTo( _Device );

The kernel object (see section 2.8) automatically creates a TCLBuildr object at run time. However, by creating a TCLBuildr object before running the kernel, you can check for compile and link errors in advance.

Object Pascal

_Buildr.Handle;  // Run build

_Buildr.CompileStatus :T_cl_build_status  // Compile status
_Buildr.CompileLog    :String             // Compile log
_Buildr.LinkStatus    :T_cl_build_status  // Link status
_Buildr.LinkLog       :String             // Link log
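
A possible way to use these properties is a small reporting routine that builds for one device and prints the relevant log on failure. This is a sketch under two assumptions: the routine lives in a unit that already uses the library's units, and the success value is exposed as CL_BUILD_SUCCESS, the name used in the OpenCL C headers. ReportBuild itself is a hypothetical helper, not part of the library.

```pascal
// Hypothetical helper; requires the library's units in the uses clause.
// ASSUMPTION: CL_BUILD_SUCCESS is available under its OpenCL header name.
procedure ReportBuild( const Execut_ :TCLExecut; const Device_ :TCLDevice );
var
  _Buildr :TCLBuildr;
begin
  _Buildr := Execut_.BuildTo( Device_ );
  _Buildr.Handle;  // Run build, as in the listing above

  if _Buildr.CompileStatus <> CL_BUILD_SUCCESS then
  begin
    Writeln( 'Compile failed:' );
    Writeln( _Buildr.CompileLog );
  end
  else
  if _Buildr.LinkStatus <> CL_BUILD_SUCCESS then
  begin
    Writeln( 'Link failed:' );
    Writeln( _Buildr.LinkLog );
  end
  else
    Writeln( 'Build succeeded.' );
end;
```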

⬤ 2.8. Kernel

The "kernel" object (TCLKernel) represents an executable function in a program.

OpenCL C

kernel void Main( ... ) {
  ...
}

A TCLKernel object is created from a TCLExecut object, the name of the kernel function, and a TCLQueuer object.

Object Pascal

_Kernel := TCLKernel.Create( _Execut, 'Main', _Queuer );
_Kernel := _Execut.Kernels.Add( 'Main', _Queuer );

▼ 2.8.1. Parameter

Memory and sampler objects are bound by name to the parameters of the kernel function through the Parames property of the TCLKernel class.

Object Pascal

_Kernel.Parames['Buffer'] := _Buffer;  // Connect to buffer
_Kernel.Parames['Imager'] := _Imager;  // Connect to image
_Kernel.Parames['Samplr'] := _Samplr;  // Connect to sampler

▼ 2.8.2. Loop Count

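An OpenCL kernel is executed repeatedly, like the body of a triple nested loop. The GloSizX, GloSizY, and GloSizZ properties set the number of iterations in each direction.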

Object Pascal

_Kernel.GloSizX := 100;  // Number of loops in X direction
_Kernel.GloSizY := 200;  // Number of loops in Y direction
_Kernel.GloSizZ := 300;  // Number of loops in Z direction

You can also specify the minimum and maximum loop indices.

Object Pascal

_Kernel.GloMinX := 0;      // Start index in X direction
_Kernel.GloMinY := 0;      // Start index in Y direction
_Kernel.GloMinZ := 0;      // Start index in Z direction

_Kernel.GloMaxX := 100-1;  // End index in X direction
_Kernel.GloMaxY := 200-1;  // End index in Y direction
_Kernel.GloMaxZ := 300-1;  // End index in Z direction
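
To make the loop analogy concrete, the plain Delphi program below emulates the index space on the host: each innermost iteration stands for one kernel invocation whose global ID is (X, Y, Z). It assumes that the maximum index equals the minimum index plus the size minus one, as the values in the two listings above suggest. On a real device the invocations run in parallel and in no guaranteed order.

```pascal
program TripleLoopDemo;

{$APPTYPE CONSOLE}

// Counts how often a kernel body would run for the given index ranges
function CountInvocations( const MinX_, MaxX_,
                                 MinY_, MaxY_,
                                 MinZ_, MaxZ_ :Integer ) :Int64;
var
  X, Y, Z :Integer;
begin
  Result := 0;
  for Z := MinZ_ to MaxZ_ do
    for Y := MinY_ to MaxY_ do
      for X := MinX_ to MaxX_ do
        Inc( Result );  // One kernel invocation with global ID (X, Y, Z)
end;

begin
  // Same ranges as GloMin*/GloMax* above: 100 * 200 * 300 invocations
  Writeln( 'Kernel invocations: ', CountInvocations( 0, 100-1, 0, 200-1, 0, 300-1 ) );
end.
```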

▼ 2.8.3. Run

Object Pascal

_Kernel.Run;  // Run
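
Putting the sections together, a complete run might look like the sketch below. It only chains calls shown earlier in this document. The unit name, the file name Main.cl, the choice of the first platform and device, and the expectation that mapping the buffer after Run exposes the results are all assumptions.

```pascal
program RunKernel;

{$APPTYPE CONSOLE}

uses
  LUX.GPU.OpenCL;  // ASSUMPTION: unit name; further units may be required

type
  TItem = record
    A :Integer;
    B :Double;
  end;

var
  _Platfo :TCLPlatfo;
  _Device :TCLDevice;
  _Contex :TCLContex;
  _Queuer :TCLQueuer;
  _Buffer :TCLBuffer<TItem>;
  _Execut :TCLExecut;
  _Kernel :TCLKernel;
  _Item   :TItem;
  I       :Integer;
begin
  // 2.1 - 2.4: platform, device, context, command queue
  _Platfo := TOpenCL.Platfos[ 0 ];
  _Device := _Platfo.Devices[ 0 ];
  _Contex := TCLContex.Create( _Platfo );
  _Queuer := TCLQueuer.Create( _Contex, _Device );

  // 2.5: buffer with three elements
  _Buffer := TCLBuffer<TItem>.Create( _Contex, _Queuer );
  _Buffer.Count := 3;
  _Buffer.Data.Map;
  for I := 0 to 2 do
  begin
    _Item.A := I;
    _Item.B := 0.5 * I;
    _Buffer.Data[ I ] := _Item;
  end;
  _Buffer.Data.Unmap;

  // 2.6: executable program
  _Execut := TCLExecut.Create( _Contex );
  _Execut.Source.LoadFromFile( 'Main.cl' );  // HYPOTHETICAL file name

  // 2.8: kernel, parameter binding, index space, run
  _Kernel := TCLKernel.Create( _Execut, 'Main', _Queuer );
  _Kernel.Parames['Buffer'] := _Buffer;
  _Kernel.GloSizX := 3;  // One invocation per element
  _Kernel.GloSizY := 1;
  _Kernel.GloSizZ := 1;
  _Kernel.Run;

  // Read back (ASSUMPTION: Map makes the device results visible to the host)
  _Buffer.Data.Map;
  for I := 0 to 2 do
    Writeln( _Buffer.Data[ I ].A, ' ', _Buffer.Data[ I ].B:0:3 );
  _Buffer.Data.Unmap;
end.
```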

■ 3. Reference

⬤ 3.2. GitHub

