[Stream] Enable batch affinity queries in SpecializeEncoding pass. #19975

hanhanW · 2025-02-12T17:26:53Z

The returned function (i.e., ResolveLayoutAttrFn) can be very inefficient because a call may involve other data-flow analyses. This revision updates the ResolveLayoutAttrFn API: it now accepts a list of queries and stores the results in a map of SetVector<Attribute>.
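To illustrate the batching idea described above, here is a minimal standalone sketch. It does not use the real IREE or MLIR types: `Affinity`, `OpId`, `Layout`, `BatchResolveFn`, `AnalysisCounter`, and `makeResolver` are hypothetical stand-ins built from standard-library types, and the counter only models "an expensive analysis runs once per call".

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for AffinityAttr, Operation *, and Attribute.
using Affinity = std::string;
using OpId = int;
using Layout = std::string;
using AffinityAndOp = std::pair<Affinity, OpId>;
using LayoutMap = std::map<AffinityAndOp, std::set<Layout>>;

// Batched resolver: one call answers every query and fills the map.
// Returns false on failure.
using BatchResolveFn = std::function<bool(
    const std::vector<AffinityAndOp> &queries, LayoutMap &layouts)>;

// Counts how many times the (assumed expensive) analysis is executed.
struct AnalysisCounter {
  int runs = 0;
};

// Builds a resolver that runs the analysis once per call, no matter how
// many queries the call carries. The counter must outlive the resolver.
BatchResolveFn makeResolver(AnalysisCounter &counter) {
  return [&counter](const std::vector<AffinityAndOp> &queries,
                    LayoutMap &layouts) {
    ++counter.runs; // stands in for the expensive analysis
    for (const AffinityAndOp &query : queries) {
      layouts[query].insert("layout_for_" + query.first);
    }
    return true;
  };
}
```

With a per-query API the analysis in this sketch would run once per query; with the batched form it runs once per batch, which is the kind of saving the description is after.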

In the encoding specialization pass, the revision introduces a StreamTensorOpUpdater class. The updater works in two phases: it caches all the queries in init() and updates all the encodings in run(). The init method exists because initialization can fail; the work is not done in the constructor because a constructor cannot signal an error. See https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors

The pass gets a 440x speed-up on one of the SDXL compilations.
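The two-phase shape described above can be sketched without any MLIR dependencies. This is not the actual StreamTensorOpUpdater: `EncodingUpdater`, its string-based "ops", and the rule that an empty name is unsupported are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical analog of a two-phase updater: init() can fail, run() cannot
// proceed unless init() succeeded.
class EncodingUpdater {
public:
  // The constructor only stores its input and cannot fail.
  explicit EncodingUpdater(std::vector<std::string> opNames)
      : ops(std::move(opNames)) {}

  // Phase 1: resolve and cache a layout for every op.
  // Returns false if any op is unsupported (modeled as an empty name).
  bool init() {
    for (const std::string &op : ops) {
      if (op.empty()) {
        return false;
      }
      cachedLayouts[op] = "layout_for_" + op;
    }
    initialized = true;
    return true;
  }

  // Phase 2: apply the cached layouts. Returns false if init() did not
  // complete successfully.
  bool run(std::vector<std::string> &updated) const {
    if (!initialized) {
      return false;
    }
    for (const std::string &op : ops) {
      updated.push_back(op + ":" + cachedLayouts.at(op));
    }
    return true;
  }

private:
  std::vector<std::string> ops;
  std::map<std::string, std::string> cachedLayouts;
  bool initialized = false;
};
```

A caller would check `init()` before calling `run()`, so the failure is reported through an ordinary return value rather than from a constructor.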

The lit test configuration change (i.e., --pass-pipeline='builtin.module(iree-stream-specialize-encodings)') is needed because we want to validate failures for unsupported encodings.


benvanik

minor style nits and otherwise lgtm!

benvanik · 2025-02-12T17:33:07Z

compiler/src/iree/compiler/Dialect/Stream/IR/StreamInterfaces.h

 using ResolveLayoutAttrFn = std::function<LogicalResult(
-    AffinityAttr, Operation *, SetVector<Attribute> &)>;
+    ArrayRef<AffinityAndOpPair>,


style: you can add argument names here to help readability and intellisense (e.g. std::function<void(int a, int b, int c)>)

Good point, thanks!
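As a small illustration of the style suggestion above: parameter names inside a `std::function` signature are ignored by the compiler, so they can be added purely as documentation. The aliases and `echoQueries` below are hypothetical and use standard-library types rather than the real ResolveLayoutAttrFn arguments.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>
#include <vector>

// Named parameters document what each argument means.
using ResolveFn =
    std::function<bool(const std::vector<std::string> &queries,
                       std::vector<std::string> &results)>;

// The same type written without names; harder to read at the call site.
using ResolveFnUnnamed = std::function<bool(
    const std::vector<std::string> &, std::vector<std::string> &)>;

// The names do not change the type.
static_assert(std::is_same<ResolveFn, ResolveFnUnnamed>::value,
              "parameter names are not part of the function type");

// Trivial resolver used to exercise the alias: copies queries to results.
bool echoQueries(const std::vector<std::string> &queries,
                 std::vector<std::string> &results) {
  results = queries;
  return true;
}
```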

benvanik · 2025-02-12T17:35:33Z

compiler/src/iree/compiler/Dialect/Stream/Transforms/SpecializeEncodings.cpp

+  // encodings. See StreamInterfaces.h for more details.
+  IREE::Stream::ResolveLayoutAttrFn resolveLayoutAttr;
+};
+} // namespace


style: balanced whitespace
(but you may already be in an anonymous namespace from line 36 and not need this at all - it's hard to spot because there's no whitespace after it and it blends in with the following function :)

Suggested change

} // namespace

} // namespace

The problem with anonymous namespaces is that they naturally want to encourage indentation of their body, and they reduce locality of reference: if you see a random function definition in a C++ file, it is easy to see if it is marked static, but seeing if it is in an anonymous namespace requires scanning a big chunk of the file.

Because of this, we have a simple guideline: make anonymous namespaces as small as possible, and only use them for class declarations.

I agree that it's hard to spot, so I really like the LLVM style guide idea. Let me update the namespace to only scope these classes. The local methods already have static keywords, and they do not need to be within an anonymous namespace.

https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
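A minimal sketch of the layout that guideline leads to, assuming a hypothetical `LayoutCache` class and `describe` helper (neither is from the PR): the anonymous namespace wraps only the class declaration, and the file-local function is marked `static` so its linkage is visible at its definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace {

// Only the class declaration lives in the anonymous namespace.
class LayoutCache {
public:
  void add(const std::string &layout) { layouts.push_back(layout); }
  std::size_t size() const { return layouts.size(); }

private:
  std::vector<std::string> layouts;
};

} // namespace

// File-local helper: `static` gives it internal linkage without needing to
// scan upward for an enclosing anonymous namespace.
static std::string describe(const LayoutCache &cache) {
  return std::to_string(cache.size()) + " cached layout(s)";
}
```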

I forgot to mention that I fixed the namespace scope in the commit.

benvanik · 2025-02-12T17:37:03Z

compiler/src/iree/compiler/Dialect/Stream/Transforms/SpecializeEncodings.cpp

+};
+} // namespace
+
+StreamTensorOpUpdater::StreamTensorOpUpdater(ModuleOp moduleOp)


style: when the class is defined in the same file, it's useful to keep functions inline unless moving them out-of-line helps readability - it's fewer lines of code to have this up there and also harder to lose things (you also declare StreamTensorOpUpdater(); but never define it, which is harder to see because it's out-of-line)


hanhanW requested a review from benvanik as a code owner February 12, 2025 17:26

benvanik approved these changes Feb 12, 2025


address comments

630e6af

Signed-off-by: hanhanW <hanhan0912@gmail.com>

hanhanW enabled auto-merge (squash) February 12, 2025 18:06

hanhanW merged commit d11b876 into iree-org:main Feb 12, 2025
38 of 39 checks passed

hanhanW deleted the improve-specialize-encoding-pass-compilation-time-2 branch February 12, 2025 18:30
