Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove use of AVCodecPtr in interface definition #453

Merged
merged 5 commits into from
Jan 22, 2025

Conversation

scotts
Copy link
Contributor

@scotts scotts commented Jan 17, 2025

This somehow slipped through the checking in #452.

What we used to call AVCodecPtr and I've now renamed to AVCodecPtrBestStream is an alias for AVCodec* or const AVCodec* depending on which FFmpeg version we're compiling for. The const-requirements for this pointer changed across versions in the call av_find_best_stream(). I've renamed the alias to make it obvious that we only should use it for when we're calling av_find_best_stream(); it's not a general alias for all uses of AVCodec*.

Where this gets awkward is in findCudaCodec() where we call other functions that always expect a const AVCodec*. So I created a function to abstract whether or not we need to a const-cast. My principle here is all checking of FFmpeg versions should happen in FFMpegCommon.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 17, 2025
@scotts scotts requested a review from NicolasHug January 17, 2025 20:50
@scotts scotts marked this pull request as ready for review January 17, 2025 20:50
Comment on lines 467 to 469
codec = makeAVCodecPtrBestStream(
findCudaCodec(options.device, streamInfo.stream->codecpar->codec_id)
.value_or(codec));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check my understanding: the main point of this PR is that previously, findCudaCodec would always return a const AVCodec*, for all FFmpeg versions. This const AVCodec* would be cast to a AVCodecPtr, and on some FFmpeg versions that would implicitly become a non-const AVCodec*? By calling makeAVCodecPtrBestStream, now we're making this cast explicit?

Copy link
Contributor Author

@scotts scotts Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no, previously findCudaCodec() returned an AVCodecPtr. But inside of findCudaCodec() we were actually looking for and finding what was always a const AVCodec*. The FFmpeg functions we're calling in that function always take and return a const AVCodec*. That caused a problem with FFmpeg version 4, because in that version, AVCodecPtr is just a AVCodec*. The problem is we would try to assign a const pointer to a non-const pointer, which is not an allowed implicit conversion. The makeAVCodecPtrBestStream() utility does the appropriate const-cast when needed.

@@ -441,7 +441,7 @@ void VideoDecoder::addVideoStreamDecoder(
" is already active.");
}
TORCH_CHECK(formatContext_.get() != nullptr);
AVCodecPtr codec = nullptr;
AVCodecPtrBestStream codec = nullptr;
Copy link
Member

@NicolasHug NicolasHug Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear that the name AVCodecPtrBestStream may lead to some confusion. It's true that we only use it when interacting with av_find_best_stream, but weirdly enough, av_find_best_stream's purpose isn't just to automatically detect the "best" stream (in the sense that we would normally think about it).

We pass preferredStreamNumber to av_find_best_stream, and av_find_best_stream will only run its "find the best stream" heuristic when preferredStreamNumber is -1. When we call it with e.g. 3, it will return the stream with index 3 (which may not be the "best" stream). So AVCodecPtrBestStream isn't necessarily the codec of the best stream either.

(docs and source)


Note: While the av_find_best_stream name is a bit misleading, I think we should still call it, perhaps with a comment to disambiguate.: internally it does some work that we would have to be doing manually if we didn't call it, particularly it sets the codec value by calling avcodec_find_decoder().


I've renamed the alias to make it obvious that we only should use it for when we're calling av_find_best_stream(); it's not a general alias for all uses of AVCodec*

I'm curious to learn more about the concern here. My current understanding (as per my other comment) is that we want to avoid implicit casts, so maybe keeping the existing AVCodecPtr name is OK - but I might be missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention is to rename AVCodecPtr to what is morally AVCodecPtrOnlyUseThisTypeWhenCallingAVFormatBestStream. Hence I appended BestStream; my intention was not to say "This is the codec for the best stream," but rather "This is the AVCodec pointer type to use when calling av_find_best_stream()."

My concern, which is kinda what happened, is that AVCodecPtr looks general enough that someone will want to use it as a general "I need a pointer to an AVCodec" type. But it's not that. It's a type which allows us to dance around the awkwardness that the function av_find_best_stream() had a breaking API change after version 4.

Implicit casts aren't a problem, per se - the compiler will prevent them. The problem is overuse of AVCodecPtr. I want to give it an ugly name that is awkward to use to indicate that it's not a general type.

@scotts
Copy link
Contributor Author

scotts commented Jan 22, 2025

@NicolasHug, I made the name even uglier.

I thought about abstracting the call itself, and I felt that would make things worse. We really need to two values from the call to av_find_best_stream(): the stream number and the codec pointer. The codec pointer must exist before the call, because av_find_best_stream() fills it in. We could just create a struct for returning it for both, but that feels just as clumsy as what we're currently doing.

And, as I understand the FFmpeg API more, av_find_best_stream() is morally a method on the AVFormatContext object. If we were to start building abstractions for these things, that's what I'd want to build it around. We would create an actual C++ object that contains the AVFormatContext pointer, rather than just carrying around a unique pointer. The call to av_find_best_stream() would then be a method on that object. It may still be worth doing that, but I'd prefer to do that later and merge this to fix FFmpeg 4 builds.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for trying the alternative of abstracting the call @scotts , I can see now with your explanation that it is indeed clumsier than I expected. I love the new names :D !

@NicolasHug NicolasHug merged commit 64d1b0d into pytorch:main Jan 22, 2025
44 of 47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants