-
Notifications
You must be signed in to change notification settings - Fork 658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Stream] Enable batch affinity queries in SpecializeEncoding pass. #19975
[Stream] Enable batch affinity queries in SpecializeEncoding pass. #19975
Conversation
The returned function (i.e., ResolveLayoutAttrFn) can be very inefficient because there could be other data-flow analysis in a run. The revision updates the ResolveLayoutAttrFn API. Now it accepts a list of query, and it stores the results to the map of SetVector<Attribute>. In the encoding specialization pass, it introduces `StreamTensorOpUpdater` class. There are two phases in the updater. The class caches all the queries in `init()`, and updates all the encodings in `run()`. The `init` method is introduced because there could be a failure in the initialization. In this context, we do not put them to the constructor because we can not signal the error in constructors. See https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors The pass gets 440x speedup for one of SDXL compilation. Signed-off-by: hanhanW <hanhan0912@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style nits and otherwise lgtm!
using ResolveLayoutAttrFn = std::function<LogicalResult( | ||
AffinityAttr, Operation *, SetVector<Attribute> &)>; | ||
ArrayRef<AffinityAndOpPair>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: you can add argument names here to help readability and intellisense (e.g. std::function<void(int a, int b, int c)>
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks!
// encodings. See StreamInterfaces.h for more details. | ||
IREE::Stream::ResolveLayoutAttrFn resolveLayoutAttr; | ||
}; | ||
} // namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: balanced whitespace
(but you may already be in an anonymous namespace from line 36 and not need this at all - it's hard to spot because there's no whitespace after it and it blends in with the following function :)
} // namespace | |
} // namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with anonymous namespaces is that they naturally want to encourage indentation of their body, and they reduce locality of reference: if you see a random function definition in a C++ file, it is easy to see if it is marked static, but seeing if it is in an anonymous namespace requires scanning a big chunk of the file.
Because of this, we have a simple guideline: make anonymous namespaces as small as possible, and only use them for class declarations.
I agree that it's hard to spot, so I really like the LLVM style guide idea. Let me update the namespace to only scope these classes. The local methods already have static
keywords, and they do not need to be within an anonymous namespace.
https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention that I fixed the namespace scope in the commit.
}; | ||
} // namespace | ||
|
||
StreamTensorOpUpdater::StreamTensorOpUpdater(ModuleOp moduleOp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: it's useful to keep functions inline unless it helps readability when the class is defined in the same file - it's fewer lines of code to have this up there and also harder to lose things (you declare StreamTensorOpUpdater();
as well but never define it, which is harder to see because it's out-of-line)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: hanhanW <hanhan0912@gmail.com>
…ree-org#19975) The returned function (i.e., `ResolveLayoutAttrFn`) can be very inefficient because there could be other data-flow analysis in a run. The revision updates the `ResolveLayoutAttrFn` API. Now it accepts a list of query, and it stores the results to the map of `SetVector<Attribute>`. In the encoding specialization pass, it introduces `StreamTensorOpUpdater` class. There are two phases in the updater. The class caches all the queries in `init()`, and updates all the encodings in `run()`. The `init` method is introduced because there could be a failure in the initialization. In this context, we do not put them to the constructor because we can not signal the error in constructors. See https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors The pass gets 440x speed-up for one of SDXL compilation. The lit test configuration change (i.e., `--pass-pipeline='builtin.module(iree-stream-specialize-encodings)'`) is needed because we want to validate failures for unsupported encodings. --------- Signed-off-by: hanhanW <hanhan0912@gmail.com>
The returned function (i.e.,
ResolveLayoutAttrFn
) can be very inefficient because there could be other data-flow analysis in a run. The revision updates theResolveLayoutAttrFn
API. Now it accepts a list of query, and it stores the results to the map ofSetVector<Attribute>
.In the encoding specialization pass, it introduces
StreamTensorOpUpdater
class. There are two phases in the updater. The class caches all the queries ininit()
, and updates all the encodings inrun()
. Theinit
method is introduced because there could be a failure in the initialization. In this context, we do not put them to the constructor because we can not signal the error in constructors. See https://google.github.io/styleguide/cppguide.html#Doing_Work_in_ConstructorsThe pass gets 440x speed-up for one of SDXL compilation.
The lit test configuration change (i.e.,
--pass-pipeline='builtin.module(iree-stream-specialize-encodings)'
) is needed because we want to validate failures for unsupported encodings.