Confusing input/output copies #1

lahwaacz · 2019-07-04T19:43:08Z

Hi Carl,

I think the input and output copies in the benchmark are mixed up. Since the kernel computes a = b + scalar * c, the vectors b and c should be copied from host to device, and the vector a should be copied back from device to host. See here and here.

More importantly for performance, you're making one unnecessary host-to-device copy that does not happen in the zero-copy and unified memory variants, so the comparison with pageable and pinned memory is not completely fair.
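To make the intended copy pattern concrete, here is a minimal sketch. It is not the benchmark's actual code: the names `triad`, `run_triad`, `d_a`, `d_b`, and `d_c` are hypothetical, and `float` vectors with plain pageable host memory are assumed. It shows two host-to-device copies for the inputs and one device-to-host copy for the output.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel name; computes a = b + scalar * c element-wise.
__global__ void triad(float* a, const float* b, const float* c,
                      float scalar, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        a[i] = b[i] + scalar * c[i];
}

// Hypothetical host wrapper (assumption, not from the benchmark).
// Returns false on a size mismatch or on any CUDA error.
bool run_triad(std::vector<float>& a, const std::vector<float>& b,
               const std::vector<float>& c, float scalar)
{
    if (a.size() != b.size() || a.size() != c.size())
        return false;
    const int n = static_cast<int>(a.size());
    if (n == 0)
        return true;  // nothing to compute; avoids a zero-sized launch
    const size_t bytes = n * sizeof(float);

    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess
           && cudaMalloc(&d_b, bytes) == cudaSuccess
           && cudaMalloc(&d_c, bytes) == cudaSuccess;

    // Inputs b and c: host -> device.
    ok = ok && cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(d_c, c.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    // Note: a is write-only in the kernel, so it is NOT copied to the device.

    if (ok) {
        const int block = 256;
        const int grid = (n + block - 1) / block;
        triad<<<grid, block>>>(d_a, d_b, d_c, scalar, n);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // Output a: device -> host. The blocking cudaMemcpy on the default
    // stream also waits for the kernel to finish.
    ok = ok && cudaMemcpy(a.data(), d_a, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return ok;
}
```

With this pattern the explicit-copy variants move the same data as the zero-copy and unified memory variants need to touch: b and c in, a out.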

Jakub
