CUDA Exercise Two: Timing Issues
The goal of this exercise is to understand how to time your CUDA code.
Here is a mulvec CUDA program.
- mulvec essentially computes C[i] = A[i]*B[i]. Run it and time it.
- Now create a new program, macvec, that computes C[i] = C[i] + A[i]*B[i].
Because C is now an input as well as an output, you need to pass it in,
both for the warm-up and for the timed run.
- Run and time macvec. Is there something strange here?
Can you explain it?
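
As a starting point, here is a minimal sketch of how macvec might be structured and timed. It is not the mulvec program from this exercise: the kernel signature, the array size, the launch configuration, the CHECK macro, and the use of CUDA events for timing are all assumptions made for illustration. It also assumes that "passing in C" means copying the host copy of C to the device before each run, which may differ from what the original program does.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s (line %d)\n",             \
                    cudaGetErrorString(err), __LINE__);               \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Assumed kernel: multiply-accumulate, C[i] = C[i] + A[i]*B[i].
__global__ void macvec(const float *A, const float *B, float *C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        C[i] = C[i] + A[i] * B[i];
}

int main()
{
    const int n = 1 << 20;                 // assumed vector length
    const size_t bytes = n * sizeof(float);

    float *hA = (float *)malloc(bytes);
    float *hB = (float *)malloc(bytes);
    float *hC = (float *)malloc(bytes);
    if (!hA || !hB || !hC) return EXIT_FAILURE;
    for (int i = 0; i < n; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; hC[i] = 3.0f; }

    float *dA, *dB, *dC;
    CHECK(cudaMalloc(&dA, bytes));
    CHECK(cudaMalloc(&dB, bytes));
    CHECK(cudaMalloc(&dC, bytes));
    CHECK(cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice));

    const int threads = 256;               // assumed block size
    const int blocks = (n + threads - 1) / threads;

    // Warm-up run: C is passed in, so copy it to the device first.
    CHECK(cudaMemcpy(dC, hC, bytes, cudaMemcpyHostToDevice));
    macvec<<<blocks, threads>>>(dA, dB, dC, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Timed run: pass C in again so it starts from the same input.
    CHECK(cudaMemcpy(dC, hC, bytes, cudaMemcpyHostToDevice));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start));
    macvec<<<blocks, threads>>>(dA, dB, dC, n);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaGetLastError());
    CHECK(cudaEventSynchronize(stop));     // wait until the kernel has finished

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    printf("macvec kernel time: %f ms\n", ms);

    // Copy the result back so it can be inspected on the host.
    CHECK(cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost));
    printf("C[0] = %f\n", hC[0]);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(dA)); CHECK(cudaFree(dB)); CHECK(cudaFree(dC));
    free(hA); free(hB); free(hC);
    return 0;
}
```

With nvcc installed, something like `nvcc -o macvec macvec.cu` should build it. This sketch times only the kernel; moving the event records so that they also bracket the cudaMemcpy calls gives a second measurement that may be worth comparing against the kernel-only time.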