Stream preview #157

Open
wants to merge 25 commits into
base: master

longnguyen2004
Collaborator

Only took 2 months since #133 was made...

Should keyframe extraction live in this repo, though, or in https://github.com/Discord-RE/FFmpeg-input-premade?

pkg-pr-new bot commented Feb 26, 2025

npm i https://pkg.pr.new/Discord-RE/Discord-video-stream/@dank074/discord-video-stream@157

commit: b66da92

@aiko-chan-ai

aiko-chan-ai commented Feb 26, 2025

The function to set stream preview is now available in the library
https://github.com/aiko-chan-ai/discord.js-selfbot-v13/blob/main/src/structures/VoiceState.js#L317
Example:

guild.members.me.voice.postPreview(base64img)  

To be more precise, it has been available for several versions already

@longnguyen2004
Collaborator Author

Does it work with this library handling the voice connection? If yes then I'll close

@aiko-chan-ai

aiko-chan-ai commented Feb 26, 2025

It will work with this library, but users must specify which server's voice channel they are connected to (for example, if they are connected to server A, they need to retrieve their own VoiceState for that server). Alternatively, you can use this library's APIRequest to create a more secure request (there might be a TypeScript typing error; I will fix it by making it a public property later).
And capturing a frame from the video is still very important.

@longnguyen2004
Collaborator Author

longnguyen2004 commented Feb 27, 2025

Alright, so we have a bit of a problem regarding keyframe extraction. There are 3 options, with varying levels of annoyance:

  • In-process decoding: Requires no changes from user side, but since upstream libav.js doesn't include decoders for H.26x, we'll need to build our own version and publish it on npm (might be annoying but probably doable). Speed is also a potential issue, and we might need to bring back multithreading (yet again...) and enable SIMD
  • Extra keyframe video stream inside the existing mkv stream: Not supported by fluent-ffmpeg (only 1 audio stream and 1 video stream per output)
  • Extra mkv stream containing the keyframe stream: Requires 2 demuxer instances (eww), pollutes all the API since you need to pass this keyframe stream around

Since option 1 changes nothing API-wise, I'll try to build libav.js and see how difficult it is to work with Emscripten

@longnguyen2004
Collaborator Author

Not too hard I'd say https://www.npmjs.com/package/@lng2004/libav.js-variant-webcodecs-avf-with-decoders
Now for the speed test...

@dank074
Member

dank074 commented Feb 27, 2025

A little late since you've already done it with libav.js

Extra keyframe video stream inside the existing mkv stream: Not supported by fluent-ffmpeg (only 1 audio stream and 1 video stream per output)

You can also use https://github.com/dank074/fluent-ffmpeg-multistream-ts for multiple outputs/inputs

@longnguyen2004
Collaborator Author

That is option 3 above (extra mkv stream containing the decoded keyframes). I wanted to avoid it since that means running 2 demuxer instances, which I don't like, and it also requires us to pass an extra stream around, which is not ideal.

@dank074
Member

dank074 commented Feb 27, 2025

You're right, that's actually option 3. Sorry, I misunderstood at first glance.

@longnguyen2004
Collaborator Author

longnguyen2004 commented Feb 28, 2025

It works!

{592A0D3F-BC99-4AE1-8A2E-A82FB50DD8E6}

For performance: decoding, resizing, and encoding a 1920x1080 H.264/5 keyframe to JPEG on an i3-330M takes ~1s and raises CPU usage to 30%, which is actually better than I expected, considering that there are no SIMD optimizations. Since we only decode once every 5s (which is overkill, even, since Discord doesn't load the preview that often), this is acceptable to me.
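
The "decode once every 5s" policy can be sketched as a small gate that lets a keyframe through to the decoder only when enough time has passed. This is a minimal sketch under assumptions, not the PR's actual code: `KeyframeGate`, `shouldDecode`, and `intervalMs` are hypothetical names, and the caller is assumed to already know whether a packet is a keyframe.

```typescript
// Hypothetical helper (not part of the PR): rate-limits preview decoding.
class KeyframeGate {
  private readonly intervalMs: number;
  private lastDecodeMs: number = Number.NEGATIVE_INFINITY;

  constructor(intervalMs: number = 5000) {
    this.intervalMs = intervalMs;
  }

  // Returns true only for keyframes, and at most once per interval.
  shouldDecode(isKeyframe: boolean, nowMs: number): boolean {
    if (!isKeyframe) return false;
    if (nowMs - this.lastDecodeMs < this.intervalMs) return false;
    this.lastDecodeMs = nowMs;
    return true;
  }
}
```

Passing the timestamp in explicitly (rather than calling `Date.now()` inside) keeps the gate easy to test, and raising `intervalMs` is all it takes to decode less often.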

@longnguyen2004
Collaborator Author

I love Deno and Bun

{862D427B-AA1C-4072-B0D4-6D52610B817B}
{CE9786F7-7975-416D-AE2C-717EA0E26E8F}

@longnguyen2004
Collaborator Author

!!!!! memory leak

{9A932358-C8B5-490B-A83B-78167532FC6A}

@longnguyen2004
Collaborator Author

Leak is gone, I used the wrong free function... (note to self: if there's a function x and a function x_js, it's for a good reason, use the right one)

@longnguyen2004
Collaborator Author

longnguyen2004 commented Mar 1, 2025

With O3, decoding a somewhat complex H.264 keyframe at 1080p 10000kbps takes ~200ms on an i3-330M, which is more than acceptable. The decoding interval can be increased further, since Discord doesn't update the preview often. Base memory usage is ~500MB for 1920x1080; higher quality will obviously consume more.

Again, big disclaimer that this doesn't work on Deno or Bun, and trying to enable it anyway will result in catastrophic failure.

@longnguyen2004 longnguyen2004 marked this pull request as ready for review March 1, 2025 07:37
Member

@dank074 dank074 left a comment

Sorry about the delay, somehow my comment stayed as pending and was never actually submitted

Just one comment

const { streamKey } = this.voiceConnection.streamConnection;
const data = `data:image/jpeg;base64,${image.toString("base64")}`;

await fetch(`https://discord.com/api/v9/streams/${streamKey}/preview`, {
Member

Did you find that postPreview or APIRequest from aiko's library wouldn't work here? I think it's a good idea to stick to it as much as possible, to avoid being detected by Discord's anti-bot measures, since the HTTP headers (e.g. user-agent) might differ between requests.

Collaborator Author

Is it exported yet? I'll have a look later

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code snippet

const voiceState = client.guilds.cache.get(server_id).members.me.voice;
await voiceState.postPreview('base64image');
// If error code is USER_NOT_STREAMING, try setting the property 'streaming' to true
// If this error occurs, it may be because the VOICE_STATE_UPDATE event is not cached properly
voiceState.streaming = true;
await voiceState.postPreview('base64image');
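
The retry described in the comments of the snippet above could be wrapped in a small helper. This is only a sketch under assumptions: `PreviewTarget` is a hypothetical structural type standing in for the library's VoiceState, `postPreviewWithRetry` is a made-up name, and the helper assumes the rejected error carries the code in a `code` property, which this thread does not confirm.

```typescript
// Hypothetical minimal shape of the object the snippet above calls `voiceState`.
interface PreviewTarget {
  streaming: boolean;
  postPreview(base64Image: string): Promise<unknown>;
}

// Assumption: the rejected error exposes the error code as `error.code`.
function isNotStreamingError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === "USER_NOT_STREAMING"
  );
}

// Posts the preview; on USER_NOT_STREAMING, sets `streaming` and retries once.
// Any other error, or a failure of the retry itself, is propagated to the caller.
async function postPreviewWithRetry(
  target: PreviewTarget,
  base64Image: string,
): Promise<void> {
  try {
    await target.postPreview(base64Image);
  } catch (error) {
    if (!isNotStreamingError(error)) throw error;
    // Work around a VOICE_STATE_UPDATE event that was not cached properly.
    target.streaming = true;
    await target.postPreview(base64Image);
  }
}
```

Retrying only once, and only for that specific error code, avoids hiding unrelated failures such as network errors or rate limits.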

Member

Wouldn't it be simpler to just do

await this.client.user.voice.postPreview(data);

since users can't join multiple voice channels anyway (bots can, as long as they're in separate servers)

Collaborator Author

@longnguyen2004 longnguyen2004 Mar 5, 2025

How about we expose the APIRequest thing for now? It's kinda weird for us to re-query information that we already have (but then again we already kinda did that in some ways so I'll try the suggestion above)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler to just do

await this.client.user.voice.postPreview(data);

since users can't join multiple voice channels anyway (bots can, as long as they're in separate servers)

Users can also join multiple voice channels across different servers
