Feature: Add Graph Pass for template variable generation. #26

Open
KuangjuX opened this issue Nov 20, 2024 · 0 comments
KuangjuX commented Nov 20, 2024

ThrillerFlow currently generates code from the dataflow graph alone. A complete kernel, however, needs more than the core dataflow: it also needs variable declarations and template definitions.

Taking GEMM as an example:

template <typename InType, typename AccType,                  //
          const int kM, const int kN, const int kK,           //
          const int kTM, const int kTN, const int kTK,        //
          typename GIteratorA, typename SIteratorA,           //
          typename SharedA, typename RegA,                    //
          typename G2SLoaderA, typename S2RLoaderA,           //
          typename GIteratorB, typename SIteratorB,           //
          typename SharedB, typename RegB,                    //
          typename G2SLoaderB, typename S2RLoaderB,           //
          typename GlobalC, typename SharedC, typename RegC,  //
          typename R2SStorerC, typename S2GStorerC>
__global__ void gemm(const InType* dA, const InType* dB, AccType* dC) {
    int offset_a = blockIdx.x * kTM * kK;
    int offset_b = blockIdx.y * kTN * kK;
    int offset_c = blockIdx.x * kTM * kN + blockIdx.y * kTN;

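    // Dynamic shared memory: the A tile comes first, followed by the B tile.
    extern __shared__ __align__(sizeof(double)) unsigned char buf[];
    InType* sA_ptr = reinterpret_cast<InType*>(buf);
    InType* sB_ptr = sA_ptr + SIteratorA::Tile::kNumel;
    // The C tile aliases the start of the same buffer as the A and B tiles.
    AccType* sC_ptr = reinterpret_cast<AccType*>(buf);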

    // declare tiles, iterators and loaders
    GIteratorA gAs(dA + offset_a);
    SIteratorA sAs(sA_ptr);

    GIteratorB gBs(dB + offset_b);
    SIteratorB sBs(sB_ptr);

    SharedA sA(sA_ptr);
    RegA rA;

    SharedB sB(sB_ptr);
    RegB rB;

    RegC acc;
    SharedC sC(sC_ptr);
    GlobalC gC(dC + offset_c);

    G2SLoaderA g2s_a;
    S2RLoaderA s2r_a;

    G2SLoaderB g2s_b;
    S2RLoaderB s2r_b;

    R2SStorerC r2s_c;
    S2GStorerC s2g_c;

    for (int k1 = 0; k1 < GIteratorA::sc1; ++k1) {
        g2s_a(gAs(k1), sA);
        g2s_b(gBs(k1), sB);
        __copy_async();
        __syncthreads();

        for (int k2 = 0; k2 < SIteratorA::sc1; ++k2) {
            s2r_a(sAs(k2), rA);
            s2r_b(sBs(k2), rB);

            compute::gemm(rA, rB, acc);
        }
    }
    r2s_c(acc, sC);
    __syncthreads();
    s2g_c(sC, gC);
}

Beyond the core dataflow, a complete kernel must declare variables at each memory level (global, shared, and register), provide the copy operations between memory levels (usually defined as templates), and implement TileIterators for the memory variables that are traversed tile by tile.
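The declaration part is mechanical once the memory variables are known. Below is a minimal sketch, written in host-side C++ to match the kernel above rather than in ThrillerFlow's own code. It assumes the memory nodes are available as a flat list of descriptors; `MemLevel`, `MemNode`, `Declarations`, `collect_declarations`, and `join` are hypothetical names introduced for illustration, not part of ThrillerFlow. It also assumes, as in the GEMM kernel, that register tiles are default-constructed while global and shared tiles are constructed from a pointer expression.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical memory levels, mirroring the tiers used in the GEMM kernel.
enum class MemLevel { Global, Shared, Register };

// Hypothetical descriptor of one memory node in the graph.
struct MemNode {
    std::string type_name;  // becomes a template parameter, e.g. "SharedA"
    std::string var_name;   // becomes a variable, e.g. "sA"
    MemLevel level;
    std::string init;       // pointer expression; ignored for register tiles
};

struct Declarations {
    std::vector<std::string> template_params;
    std::vector<std::string> var_decls;
};

// Visit every memory node once, in order, and emit one template parameter
// and one declaration line per node.
Declarations collect_declarations(const std::vector<MemNode>& nodes) {
    Declarations out;
    for (const MemNode& n : nodes) {
        assert(!n.type_name.empty() && !n.var_name.empty());
        out.template_params.push_back("typename " + n.type_name);
        if (n.level == MemLevel::Register) {
            // Assumption: register tiles take no constructor argument.
            out.var_decls.push_back(n.type_name + " " + n.var_name + ";");
        } else {
            out.var_decls.push_back(n.type_name + " " + n.var_name + "(" +
                                    n.init + ");");
        }
    }
    return out;
}

// Join strings with a separator, e.g. to build the template parameter list.
std::string join(const std::vector<std::string>& parts,
                 const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += sep;
        out += parts[i];
    }
    return out;
}
```

For a node described as `{"SharedA", "sA", MemLevel::Shared, "sA_ptr"}`, this yields the parameter `typename SharedA` and the declaration `SharedA sA(sA_ptr);`. Joining the collected parameters with `", "` gives the body of the `template <...>` header.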

Generating these pieces requires a preliminary Graph Pass that traverses the graph and collects the information each one needs:

  • Memory variables: it is sufficient to traverse all memory nodes in the graph and generate a declaration for each one in sequence.
  • Load/Store: a different template is constructed for each shape. This part can be placed directly in the template and constructed in the configuration. In the dataflow graph, each edge between memory nodes should be declared as a Load or Store operation between the corresponding memory levels.
  • TileIterator: an iterator repeatedly tiles a block of memory data during the load/store process, so the pass must analyze the edges between memory nodes to generate the TileIterator.
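The edge analysis in the last two points can be sketched as follows. This is an illustration under stated assumptions, not ThrillerFlow's implementation: `MemLevel`, `MemEdge`, `level_tag`, `transfer_type`, and `iterator_type` are hypothetical names, and the `num_tiles` field is an assumed way of recording how many tiles the source of an edge is split into. The generated type names follow the naming pattern of the GEMM kernel above (`G2SLoaderA`, `S2GStorerC`, `GIteratorA`).

```cpp
#include <cassert>
#include <string>

// Hypothetical memory levels, ordered from farthest to nearest the compute units.
enum class MemLevel { Global = 0, Shared = 1, Register = 2 };

// Hypothetical edge between two memory nodes in the dataflow graph.
struct MemEdge {
    MemLevel src;
    MemLevel dst;
    std::string operand;  // suffix such as "A", "B", or "C"
    int num_tiles;        // assumed: number of tiles the source is split into
};

// One-letter tag used in the generated type names.
char level_tag(MemLevel level) {
    switch (level) {
        case MemLevel::Global: return 'G';
        case MemLevel::Shared: return 'S';
        case MemLevel::Register: return 'R';
    }
    return '?';
}

// An edge moving toward registers is a load; an edge moving toward
// global memory is a store.
std::string transfer_type(const MemEdge& e) {
    const int s = static_cast<int>(e.src);
    const int d = static_cast<int>(e.dst);
    assert(s != d && "an edge must cross memory levels");
    const std::string kind = (d > s) ? "Loader" : "Storer";
    return std::string(1, level_tag(e.src)) + "2" + level_tag(e.dst) + kind +
           e.operand;
}

// A TileIterator is only needed when the source is consumed tile by tile.
// Returns an empty string when no iterator is required.
std::string iterator_type(const MemEdge& e) {
    if (e.num_tiles <= 1) return "";
    return std::string(1, level_tag(e.src)) + "Iterator" + e.operand;
}
```

With these definitions, a global-to-shared edge for operand `A` split into several tiles maps to `G2SLoaderA` and `GIteratorA`, while a register-to-shared edge for `C` with a single tile maps to `R2SStorerC` and needs no iterator. Deciding the tile count from the shapes on each edge is left out of the sketch.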
@KuangjuX KuangjuX self-assigned this Nov 20, 2024
@KuangjuX KuangjuX added enhancement New feature or request feature labels Nov 20, 2024