Poor service recovery after restarts/deploys #1647

Closed
2 of 3 tasks
dyc3 opened this issue Apr 9, 2024 · 2 comments · Fixed by #1660

dyc3 commented Apr 9, 2024

Current Behavior

Sometimes, deploys result in clients not reconnecting correctly. The system eventually recovers, usually with the help of manual page refreshes, but playback is interrupted in the meantime.

Expected Behavior

Service restarts and deploys should result in clients reconnecting to the room gracefully. Video playback should remain uninterrupted.

Steps To Reproduce

Precise steps unknown

  1. Watch a video on the staging branch
  2. Have more than 1 user in the room
  3. While the video is playing, perform a deploy
  4. Note whether or not all clients reconnect and end up in the same instance of the room.

Environment

  • This happens on the official site, opentogethertube.com
  • This happens using a self-hosted version.
  • I'm using the docker image.

Room name or URL

No response

Video URL

No response

Anything else?

No response

@dyc3 dyc3 added bug Something isn't working ui-ux/user Problem with the UI/UX from an end user perspective balancer Improvements or additions to the load balancer labels Apr 9, 2024
@dyc3 dyc3 added this to the Load Balancing milestone Apr 9, 2024
@dyc3 dyc3 self-assigned this Apr 9, 2024

dyc3 commented Apr 9, 2024

I've found that this only occurs when the monolith is being deployed.

@dyc3 dyc3 moved this from Todo to In Progress in OTT Horizontal Scaling Apr 9, 2024

dyc3 commented Apr 9, 2024

I'm seeing these logs:

2024-04-09T21:57:42Z app[6e824031ce2087] ewr [info]2024-04-09T21:57:42.270644Z ERROR ott_balancer::balancer: failed to handle client inbound: channel closed
2024-04-09T21:57:42Z app[6e824031ce2087] ewr [info]2024-04-09T21:57:42.270663Z  INFO ott_balancer::balancer: monolith disconnected, stopping client inbound handler
2024-04-09T21:57:43Z app[6e824031ce2087] ewr [info]2024-04-09T21:57:43.944497Z ERROR client_entry{room_name=RoomName("5b333992-dc12-45cd-af60-73d963d94e2e") client_id="5986a053-047f-4192-b8e1-0eda70d45fc0"}: ott_balancer::client: Error sending client message to balancer: channel closed
2024-04-09T21:57:43Z app[6e824031ce2087] ewr [info]2024-04-09T21:57:43.944525Z  INFO client_entry{room_name=RoomName("5b333992-dc12-45cd-af60-73d963d94e2e") client_id="5986a053-047f-4192-b8e1-0eda70d45fc0"}: ott_balancer::client: ending client connection
2024-04-09T21:57:43Z app[6e824031ce2087] ewr [info]2024-04-09T21:57:43.944591Z ERROR ott_balancer::service: Error in websocket connection: channel closed
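
For anyone unfamiliar with this failure mode: the logs above look like a sender outliving its receiver, so every later send fails once the monolith-side handler has stopped. The sketch below only illustrates that general pattern. It is not the balancer's code, it uses the standard library's `std::sync::mpsc` as a stand-in for whatever channel type `ott_balancer` actually uses, and `forward` is a hypothetical helper.

```rust
use std::sync::mpsc;

/// Hypothetical stand-in for a client task forwarding one message.
/// Returns the error text if the receiving side no longer exists.
fn forward(tx: &mpsc::Sender<String>, msg: &str) -> Result<(), String> {
    tx.send(msg.to_string()).map_err(|e| e.to_string())
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // While the receiver is alive, sends succeed.
    assert!(forward(&tx, "play").is_ok());

    // Simulate the receiving handler stopping (assumption: this is roughly
    // what happens when the monolith disconnects during a deploy).
    drop(rx);

    // Every later send now fails, and the sender has to decide what to do:
    // end the connection, or hold the client until a new receiver exists.
    match forward(&tx, "pause") {
        Ok(()) => println!("sent"),
        Err(e) => println!("error sending client message: {}", e),
    }
}
```

In this sketch the sender learns about the closed channel only when it next tries to send, which would be consistent with the client error appearing about a second after the "monolith disconnected" line.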
