-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make backfill behavior more explicit #700
Conversation
} | ||
|
||
fn log_ascii_warning(version: u64) { | ||
println!(r#" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was fun haha
/// If it is a regular processor, this will return the higher of the checkpointed version, | ||
/// or `staring_version` from the config, or 0 if not set. | ||
/// If it is a regular (`mode == default`) processor, this will return the higher of the (checkpointed version, | ||
/// `initial_starting_version` || 0) from the bootstrap config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q: what purpose does intial_starting_version serve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's the same purpose as it is today, i just prepended it with initial
to make it explicit that it'll only be respected on initial startup and get_starting_version()
will take the larger of the config's starting version and the checkpointed version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the updated readme?
had some questions. once confirmed should be g2g. thanks for addressing this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a few comments but devex lgtm! would you mind also updating the README.md?
- `backfill_id`: appended to `processor_type` for a unique backfill identifier | ||
- `initial_starting_version`: processor starts here unless there is a greater checkpointed version | ||
- `ending_version`: ending version of the backfill | ||
- `overwrite_checkpoint`: overwrite checkpoints if it exists, restarting the backfill from `initial_starting_version`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't this lead to crashlooping behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we either have crash looping when someone sets overwrite_checkpoint = true
, or we don't have a way to in progress backfills. I think if someone needs to restart an in-progress BF, they're able to pay attention and fix the wrong behavior.
What do you think?
/// the checkpointed version + 1. | ||
/// | ||
/// If this is a backfill processor and there is not an in-progress backfill (i.e., no checkpoint or | ||
/// backfill status is COMPLETE), this will return `starting_version` from the config, or 0 if not set. | ||
/// backfill status is COMPLETE), this will return `intial_starting_version` from the backfill config, or 0 if not set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return
intial_starting_version from the backfill config, or 0 if not set.
isn't intial_starting_version
field required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left minor questions, overall lgtm!
rust/sdk-processor/README.md
Outdated
backfill_config: | ||
backfill_id: "123" | ||
initial_starting_version: 42 | ||
ending_version: 100042 | ||
overwrite_checkpoint: false | ||
testing_config: | ||
override_starting_version: 111111 | ||
ending_version: 222222 | ||
bootstrap_config: | ||
initial_starting_version: 123456789 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are only 1 of these configs set at a time? if so, might be confusing to include all three of these in the template. could we also add comments here for when to use each mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to not include these in the example config but mention them in the explanation section:
-
backfill_config
(optional)backfill_id
: appended toprocessor_type
for a unique backfill identifierinitial_starting_version
: processor starts here unless there is a greater checkpointed versionending_version
: ending version of the backfilloverwrite_checkpoint
: overwrite checkpoints if it exists, restarting the backfill frominitial_starting_version
.
-
testing_config
(optional)override_starting_version
: starting version of the testing. always starts from this versionending_version
: ending version of the testing
-
bootstrap_config
(optional) used for regular, non-backfill, processorsinitial_starting_version
: processor starts here unless there is a greater checkpointed version.
-
mode
: (optional)default
,testing
orbackfill
. Set todefault
if no mode specified. If backfill/testing/bootstrap
configs are not specified, processor will start from 0 or the last successfully processed version.
Purpose
Make restart behavior more explicit through the introduction of
TestingConfig
,BootstrapConfig
andProcessorMode
onIndexerProcessorConfig
. Thestarting_version
ontransaction_stream_config
should no longer be used, instead there are new fieldsinitial_starting_version
on the backfill + bootstrap configs, andoverride_starting_version
on testing config.Behavior
The current live behavior of taking the higher value of
transaction_stream_config.starting_version
is moved tobootstrap_config.initial_starting_version
. This is the default behavior of the processor ifProcessorMode
is not specified. Just as before, if the respective starting version is not provided, the processor will begin from genesis.BackfillConfig.intial_starting_version
will be respected when any of the following are true:backfill_processor_status
last_successful_version >= backfill_config.ending_version
)initial_starting_version > last_successful_version
.backfill_config.overwrite_checkpoint
is set totrue
Otherwise, the processor will select
last_successful_version
.Technical Bits
Using pattern matching on
IndexerProcessorConfig.processor_mode
for different logic instarting_version.rs
. All new fields are Optional using Serde to defaultprocessor_mode
to default when not provided. Serde will throw ifmode == testing
ormode == backfill
and the respective sub-config isn't provided.Make start and end version on
backfill_config
required. Removebackfill_alias
field in favor ofbackfill_id
. This will be appended onto the processor type as a PK in thebackfill_processor_status
table.Testing
Added tests for new behavior
Mode not set, no checkpoint
Mode not set, checkpoint higher than bootstrap start ver (ran after the above run)
mode = backfill with no checkpoint
mode = backfill, checkpoint higher than
backfill_config.intial_starting_version
mode = testing, checkpoint higher than
testing_config.override_starting_version
. latter is still picked