-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(clustering/sync): improve error handling #14196
Conversation
11f26ca
to
3ecd041
Compare
-- retry without releasing the mutex as the DP has nothing to do except syncing | ||
if is_full_sync then | ||
yield() | ||
return do_sync() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we run sync in a timer? could it cause more cost?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By doing so we do not release the lock and fetch the lock again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential infinite loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand in this state, there is nothing DP can do other than syncing, but it is a little bit risky to sync again and again without any backoff delay.
This PR may conflict with another refactor PR? What is your opinion? |
691f8a3
to
5abb6fd
Compare
Resolved |
local do_sync | ||
|
||
|
||
local function error_handler(err, is_full_sync) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this error_handler()
is a premature optimization, we could have a simple function like:
try_sync(is_full_sync)
if is_full_sync then
yield()
return do_sync()
end
end
And keep the oringinal error handling code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes an unconditional yielding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be
if not res then
if is_full_sync then
return try_sync()
end
return nil, err
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for sync I don't think that we should hide this operation in an error handler function, we should show it clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will show you why this is designed like this in EE's PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway this change is removed
|
||
if not deltas then | ||
return nil, "sync get_delta error: deltas is null" | ||
return error_handler("sync get_delta error: deltas is null", is_full_sync) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some comments about when and why we should run do_sync
again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you commenting at the place you meant to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I don't understand why here we should call do_sync again, what's the difference between other error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
5abb6fd
to
1e49782
Compare
1e49782
to
cdbd77e
Compare
local do_sync | ||
|
||
|
||
local function error_handler(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have removed some code in this function, I don't think it is necessary now (only two lines).,
@@ -177,7 +177,16 @@ local function is_rpc_ready() | |||
end | |||
|
|||
|
|||
local function do_sync() | |||
local do_sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this forward declaration should be removed.
@chronolaw Let's close this PR then |
Summary
Checklist
changelog/unreleased/kong
orskip-changelog
label added on PR if changelog is unnecessary. README.mdIssue reference
KAG-6225