Using Nsight to understand the warp lockstep property in CUDA