Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
https://lmsys.org/blog/2023-11-21-lookahead-decoding/?continueFlag=a88848711e1cc65fc7bc23868fcb93a3
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#main
Arcadia: An end-to-end AI system performance simulator
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
https://leiblog.wang/a-hgemm-tvm-schedule/
Repositories (alphabetical): AITemplate, Awesome-LLM-Inference, ChezScheme, ColossalAI, DeepSpeed, DeepSpeed-MII, DeepSpeedExamples, EnergonAI, FastChat, FasterTransformer, FlexGen, LoRA, Medusa, Megatron-LM, Paddle, PyProf, TensorRT-LLM, accelerate, alpa, astra-sim, buddy-mlir, cuda-samples, cutlass, dist-ir, fairscale, flash-attention, gpu_poor, iree, jax, lightllm, llvm-project, mlc-llm, oneflow, optimum, pytorch, pytorch-generate, ray, rmm, tensorflow, text-generation-inference, transformers, triton, tutel, tvm, vllm, xformers
Unified Programming Models for Heterogeneous High-Performance Computers.
BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR
PerFlow: A Domain Specific Framework for Automatic Performance Analysis of Parallel Applications
Layer Normalization (LayerNorm) is a common neural network layer used when training deep learning models. Its main purpose is to normalize the network's activations, which helps improve training efficiency and stability.
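Written out, LayerNorm normalizes each input vector using its own mean and variance computed over the feature dimension, then applies a learnable per-feature scale and shift. This is the standard formulation; the symbols below are introduced here for reference:

```latex
y_j = \gamma_j \cdot \frac{x_j - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta_j,
\qquad
\mu = \frac{1}{d}\sum_{j=1}^{d} x_j,
\qquad
\sigma^2 = \frac{1}{d}\sum_{j=1}^{d} (x_j - \mu)^2
```

Here d is the feature size, gamma and beta are learnable per-feature parameters, and epsilon is a small constant that prevents division by zero.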
In C++, a LayerNorm layer can be implemented directly for CPU execution. The following simple example demonstrates how to implement a LayerNorm layer in C++:
```cpp
#include <vector>
#include <cmath>

// LayerNorm layer (CPU implementation)
class LayerNorm {
public:
    LayerNorm(int feature_size, float eps)
        : feature_size_(feature_size), eps_(eps),
          // Initialize the learnable scale (gamma) to 1 and shift (beta) to 0
          gamma_(feature_size, 1.0f),
          beta_(feature_size, 0.0f) {}

    // Apply LayerNorm to the input; x and y hold batch_size * feature_size_
    // values, flattened row-major.
    void forward(const std::vector<float>& x, std::vector<float>& y) {
        const int batch_size = static_cast<int>(x.size()) / feature_size_;
        y.resize(x.size());
        std::vector<float> mean_buf(batch_size, 0.0f), var_buf(batch_size, 0.0f);
        // Compute each sample's mean and variance over the feature dimension
        for (int i = 0; i < batch_size; ++i) {
            for (int j = 0; j < feature_size_; ++j) {
                mean_buf[i] += x[i * feature_size_ + j] / feature_size_;
            }
            for (int j = 0; j < feature_size_; ++j) {
                const float d = x[i * feature_size_ + j] - mean_buf[i];
                var_buf[i] += d * d / feature_size_;
            }
        }
        // Normalize each sample, then apply the learnable scale and shift
        for (int i = 0; i < batch_size; ++i) {
            const float inv_std = 1.0f / std::sqrt(var_buf[i] + eps_);
            for (int j = 0; j < feature_size_; ++j) {
                y[i * feature_size_ + j] =
                    (x[i * feature_size_ + j] - mean_buf[i]) * inv_std * gamma_[j] + beta_[j];
            }
        }
    }

private:
    int feature_size_;          // feature dimension size
    float eps_;                 // small constant to avoid division by zero
    std::vector<float> gamma_;  // learnable scale
    std::vector<float> beta_;   // learnable shift
};
```
In the code above, we define a `LayerNorm` class with a constructor and a `forward` method. The constructor initializes the learnable parameters gamma (scale, set to 1) and beta (shift, set to 0); the `forward` method applies LayerNorm to the input. Inside `forward`, we first compute each sample's mean and variance over the feature dimension, then normalize each sample with those statistics and apply gamma and beta per feature.
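A minimal usage sketch follows; the flattened row-major [batch, feature] layout is the convention assumed by the class above, not part of any library API:

```cpp
#include <cstdio>
// Assumes the LayerNorm class above (and its includes) are in scope.
int main() {
    const int feature_size = 4;
    LayerNorm ln(feature_size, 1e-5f);
    // Two samples of four features each, flattened row-major.
    std::vector<float> x = {1.0f, 2.0f, 3.0f, 4.0f,
                            10.0f, 20.0f, 30.0f, 40.0f};
    std::vector<float> y;
    ln.forward(x, y);
    // Each row of y now has approximately zero mean and unit variance.
    for (float v : y) std::printf("%.4f ", v);
    std::printf("\n");
    return 0;
}
```

Because gamma starts at 1 and beta at 0, the output is the plain normalized input; in a real model these two vectors would be updated during training.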