Confusing input/output copies #1

Open
lahwaacz opened this issue Jul 4, 2019 · 0 comments

lahwaacz commented Jul 4, 2019

Hi Carl,

I think the host/device copies of the benchmark's input and output data are mixed up. Since the kernel computes a = b + scalar * c, the vectors b and c are inputs and should be copied from host to device, while the vector a is the output and should be copied back from device to host. See here and here.

Most importantly from the performance point of view, you're making one unnecessary copy from host to device, which does not happen in the zero-copy and unified memory variants, so the comparison with pageable and pinned memory is not completely fair.
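
To make the intended copy pattern concrete, here is a minimal sketch for the explicit-copy case. It is not the benchmark's actual code: the helper name `run_triad`, the `CHECK` macro, and the launch configuration are my own assumptions for illustration. Only b and c go host to device, and only a comes back.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-checking macro (not from the benchmark).
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Triad kernel: a = b + scalar * c
__global__ void triad(double* a, const double* b, const double* c,
                      double scalar, size_t n)
{
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    if (i < n)
        a[i] = b[i] + scalar * c[i];
}

// Hypothetical helper: runs the triad with explicit copies and returns a.
std::vector<double> run_triad(const std::vector<double>& b,
                              const std::vector<double>& c,
                              double scalar)
{
    const size_t n = b.size();
    const size_t bytes = n * sizeof(double);
    std::vector<double> a(n);
    if (n == 0)
        return a;  // nothing to do; avoids a zero-block launch

    double *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    CHECK(cudaMalloc(&d_a, bytes));
    CHECK(cudaMalloc(&d_b, bytes));
    CHECK(cudaMalloc(&d_c, bytes));

    // Inputs only: b and c go host -> device. a is never uploaded.
    CHECK(cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(d_c, c.data(), bytes, cudaMemcpyHostToDevice));

    const unsigned int threads = 256;  // assumed block size
    const unsigned int blocks =
        static_cast<unsigned int>((n + threads - 1) / threads);
    triad<<<blocks, threads>>>(d_a, d_b, d_c, scalar, n);
    CHECK(cudaGetLastError());

    // Output only: a comes device -> host (this copy also synchronizes).
    CHECK(cudaMemcpy(a.data(), d_a, bytes, cudaMemcpyDeviceToHost));

    CHECK(cudaFree(d_a));
    CHECK(cudaFree(d_b));
    CHECK(cudaFree(d_c));
    return a;
}
```

With this pattern the timed region contains two host-to-device copies and one device-to-host copy, which matches the data that the zero-copy and unified memory variants actually have to move.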

Jakub
