Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PATCH v2] test: timer_accuracy: introduce "concurrency" mode #2180

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

JereLeppanen
Copy link
Collaborator

No description provided.

Remove unnecessary setting of option values to zero when they have
already been zeroed.

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
…ime mode

Set min_tmo to period / 10 only for one-shot absolute time mode, as it
requires a min_tmo smaller than the period. For all other modes, set
min_tmo to the same value as the period.

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
…options

Clearly separate mode-specific command line options from generic
options in the help text.

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
@odpbuild odpbuild changed the title test: timer_accuracy: introduce "concurrency" mode [PATCH v1] test: timer_accuracy: introduce "concurrency" mode Jan 29, 2025
@@ -651,7 +701,7 @@ static int create_timers(test_global_t *test_global)
printf(" resolution: %" PRIu64 " nsec\n", timer_param.res_ns);
}

timer_param.num_timers = alloc_timers;
timer_param.num_timers = alloc_timers * 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The odp_timer_capability_t.max_timers check above looks to be now invalid. Also, isn't this necessary only for concurrency mode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and it's not *2, the extra number of timers needed is actually equal to cpu_count.

@@ -591,7 +640,8 @@ static int create_timers(test_global_t *test_global)

odp_pool_param_init(&pool_param);
pool_param.type = ODP_POOL_TIMEOUT;
pool_param.tmo.num = alloc_timers;
pool_param.tmo.num =
alloc_timers * 2 + pool_param.tmo.cache_size * test_global->opt.cpu_count;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 * necessary only for concurrency mode?

uint64_t starts = ctx->starts;

if (events > starts)
ODPH_ERR("ctx %p timer %p time %" PRIu64 " starts %" PRIu64
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this suggest duplicate events? If so, probably best just to abort/exit right away so this won't go unnoticed even in CI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In v2, added an option to exit with _exit().


if (test_global->opt.max_diff &&
diff > (int64_t)test_global->opt.max_diff) {
ODPH_ERR("ctx %p timer %p time %" PRIu64 " diff %" PRIi64 "\n", ctx,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also count the number of events exceeding max_diff and print this later in print_stat().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_exit() also here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright can be updated.

Comment on lines 1349 to 1351
if (test_global->opt.cancel_start &&
!(events % test_global->opt.cancel_start)) {
/*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing, but this could be saved to a const variable as the same condition is checked again below. Perhaps same could be done to all feature conditions back-to-back for readability.

odp_timeout_t timeout =
odp_timeout_alloc(test_global->timeout_pool);
if (timeout == ODP_TIMEOUT_INVALID) {
printf("Timeout alloc failed\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ODPH_ERR()

@@ -1449,6 +1642,41 @@ int main(int argc, char *argv[])
odp_time_wait_ns(10 * ODP_TIME_MSEC_IN_NS);

cancel_periodic_timers(test_global);
} else if (test_global->opt.mode == MODE_CONCURRENCY) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be split into a separate function.

Comment on lines 1674 to 1676
printf("msec %" PRIu64 " total events %" PRIu64 " events %" PRIu64
" events/s %" PRIu64 " stopped %d\n",
msec_total, events, e, e * 1000 / msec, stopped);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use seconds here as the print interval is always in seconds.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exactly seconds, but changed anyway.

* interval.
*/
timer_ctx_t *ctx = &test_global->timer_ctx[i];
uint64_t ctx_events = ctx->events;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a possible data race. Is this information that necessary to print anyway?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely a data race. It's avoiding synchronization in order to have minimal effect on the test itself. The prints have been very useful in recent testing, especially when some timers stop generating events, some periodic check like this is needed to see that something stopped from happening.

On the other hand perhaps it's better to add prints ad hoc, when needed to investigate a specific problem. So this could just be removed.

For reporting errors, use ODPH_ERR() instead of printf().

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
Introduce a new "concurrency" mode that performs timer starts and
cancels across multiple threads. This mode aims to uncover concurrency
issues in the timer implementation, particularly with a short period,
where timer expirations, starts, and cancels frequently involve the
same timer bucket.

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
Run all timer_accuracy modes, not just the default.

Signed-off-by: Jere Leppänen <jere.leppanen@nokia.com>
@JereLeppanen JereLeppanen force-pushed the dev/test-timer-concurrency branch from bea0d63 to 2dae501 Compare February 7, 2025 12:19
@odpbuild odpbuild changed the title [PATCH v1] test: timer_accuracy: introduce "concurrency" mode [PATCH v2] test: timer_accuracy: introduce "concurrency" mode Feb 7, 2025
@JereLeppanen
Copy link
Collaborator Author

v2:

  • Prcess review comments.
  • Add commit test: timer_accuracy: report errors with ODPH_ERR()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants