Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove use of AVCodecPtr in interface definition #453

Merged
merged 5 commits into from
Jan 22, 2025
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/torchcodec/decoders/_core/CPUOnlyDevice.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ void releaseContextOnCuda(
throwUnsupportedDeviceError(device);
}

std::optional<AVCodecPtr> findCudaCodec(
std::optional<const AVCodec*> findCudaCodec(
const torch::Device& device,
[[maybe_unused]] const AVCodecID& codecId) {
throwUnsupportedDeviceError(device);
Expand Down
2 changes: 1 addition & 1 deletion src/torchcodec/decoders/_core/DeviceInterface.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ void releaseContextOnCuda(
const torch::Device& device,
AVCodecContext* codecContext);

std::optional<AVCodecPtr> findCudaCodec(
std::optional<const AVCodec*> findCudaCodec(
const torch::Device& device,
const AVCodecID& codecId);

Expand Down
8 changes: 8 additions & 0 deletions src/torchcodec/decoders/_core/FFMPEGCommon.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,14 @@

namespace facebook::torchcodec {

AVCodecPtrBestStream makeAVCodecPtrBestStream(const AVCodec* codec) {
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(59, 18, 100)
return const_cast<AVCodec*>(codec);
#else
return codec;
#endif
}

std::string getFFMPEGErrorStringFromErrorCode(int errorCode) {
char errorBuffer[AV_ERROR_MAX_STRING_SIZE] = {0};
av_strerror(errorCode, errorBuffer, AV_ERROR_MAX_STRING_SIZE);
Expand Down
8 changes: 6 additions & 2 deletions src/torchcodec/decoders/_core/FFMPEGCommon.h
Original file line number Diff line number Diff line change
Expand Up @@ -75,12 +75,16 @@ using UniqueSwsContext =
// which was released in FFMPEG version=5.0.3
// with libavcodec's version=59.18.100
// (https://www.ffmpeg.org/olddownload.html).
// Note that the alias is so-named so that it is only used when interacting with
// av_find_best_stream(). It is not needed elsewhere.
#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(59, 18, 100)
using AVCodecPtr = AVCodec*;
using AVCodecPtrBestStream = AVCodec*;
#else
using AVCodecPtr = const AVCodec*;
using AVCodecPtrBestStream = const AVCodec*;
#endif

AVCodecPtrBestStream makeAVCodecPtrBestStream(const AVCodec* codec);

// Success code from FFMPEG is just a 0. We define it to make the code more
// readable.
const int AVSUCCESS = 0;
Expand Down
9 changes: 5 additions & 4 deletions src/torchcodec/decoders/_core/VideoDecoder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -426,7 +426,7 @@ void VideoDecoder::createFilterGraph(
}

int VideoDecoder::getBestStreamIndex(AVMediaType mediaType) {
AVCodecPtr codec = nullptr;
AVCodecPtrBestStream codec = nullptr;
int streamNumber =
av_find_best_stream(formatContext_.get(), mediaType, -1, -1, &codec, 0);
return streamNumber;
Expand All @@ -441,7 +441,7 @@ void VideoDecoder::addVideoStreamDecoder(
" is already active.");
}
TORCH_CHECK(formatContext_.get() != nullptr);
AVCodecPtr codec = nullptr;
AVCodecPtrBestStream codec = nullptr;
Copy link
Member

@NicolasHug NicolasHug Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear that the name AVCodecPtrBestStream may lead to some confusion. It's true that we only use it when interacting with av_find_best_stream, but weirdly enough, av_find_best_stream's purpose isn't just to automatically detect the "best" stream (in the sense that we would normally think about it).

We pass preferredStreamNumber to av_find_best_stream, and av_find_best_stream will only run its "find the best stream" heuristic when preferredStreamNumber is -1. When we call it with e.g. 3, it will return the stream with index 3 (which may not be the "best" stream). So AVCodecPtrBestStream isn't necessarily the codec of the best stream either.

(docs and source)


Note: While the av_find_best_stream name is a bit misleading, I think we should still call it, perhaps with a comment to disambiguate.: internally it does some work that we would have to be doing manually if we didn't call it, particularly it sets the codec value by calling avcodec_find_decoder().


I've renamed the alias to make it obvious that we only should use it for when we're calling av_find_best_stream(); it's not a general alias for all uses of AVCodec*

I'm curious to learn more about the concern here. My current understanding (as per my other comment) is that we want to avoid implicit casts, so maybe keeping the existing AVCodecPtr name is OK - but I might be missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention is to rename AVCodecPtr to what is morally AVCodecPtrOnlyUseThisTypeWhenCallingAVFormatBestStream. Hence I appended BestStream; my intention was not to say "This is the codec for the best stream," but rather "This is the AVCodec pointer type to use when calling av_find_best_stream()."

My concern, which is kinda what happened, is that AVCodecPtr looks general enough that someone will want to use it as a general "I need a pointer to an AVCodec" type. But it's not that. It's a type which allows us to dance around the awkwardness that the function av_find_best_stream() had a breaking API change after version 4.

Implicit casts aren't a problem, per se - the compiler will prevent them. The problem is overuse of AVCodecPtr. I want to give it an ugly name that is awkward to use to indicate that it's not a general type.

int streamNumber = av_find_best_stream(
formatContext_.get(),
AVMEDIA_TYPE_VIDEO,
Expand All @@ -464,8 +464,9 @@ void VideoDecoder::addVideoStreamDecoder(
}

if (options.device.type() == torch::kCUDA) {
codec = findCudaCodec(options.device, streamInfo.stream->codecpar->codec_id)
.value_or(codec);
codec = makeAVCodecPtrBestStream(
findCudaCodec(options.device, streamInfo.stream->codecpar->codec_id)
.value_or(codec));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check my understanding: the main point of this PR is that previously, findCudaCodec would always return a const AVCodec*, for all FFmpeg versions. This const AVCodec* would be cast to a AVCodecPtr, and on some FFmpeg versions that would implicitly become a non-const AVCodec*? By calling makeAVCodecPtrBestStream, now we're making this cast explicit?

Copy link
Contributor Author

@scotts scotts Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no, previously findCudaCodec() returned an AVCodecPtr. But inside of findCudaCodec() we were actually looking for and finding what was always a const AVCodec*. The FFmpeg functions we're calling in that function always take and return a const AVCodec*. That caused a problem with FFmpeg version 4, because in that version, AVCodecPtr is just a AVCodec*. The problem is we would try to assign a const pointer to a non-const pointer, which is not an allowed implicit conversion. The makeAVCodecPtrBestStream() utility does the appropriate const-cast when needed.

}

AVCodecContext* codecContext = avcodec_alloc_context3(codec);
Expand Down
Loading