Performance Tuning

Improving Start Time

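  1. Compile CUDA kernels for the exact compute capability of the target device. When the binary already contains native code for that device, the driver does not have to JIT-compile PTX at load time, which shortens startup.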
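
As a minimal sketch of the point above, the helper below builds the nvcc `-gencode` flag for a single compute capability. The function name `nvcc_arch_flag` and the example value 8.6 are illustrative assumptions and not part of this project; substitute the capability your device reports (for example from the CUDA `deviceQuery` sample).

```python
def nvcc_arch_flag(major: int, minor: int) -> str:
    """Return an nvcc flag that emits native code for one compute capability.

    Hypothetical helper for illustration only. Using code=sm_XY embeds
    device-specific machine code, so no PTX JIT step is needed on that device.
    """
    arch = f"{major}{minor}"
    return f"-gencode=arch=compute_{arch},code=sm_{arch}"


if __name__ == "__main__":
    # Example: a device with compute capability 8.6 (assumed value).
    print(nvcc_arch_flag(8, 6))  # -gencode=arch=compute_86,code=sm_86
```

The resulting flag can be passed to nvcc directly or through whatever mechanism the build system provides for extra compiler flags. A binary built this way carries native code only for that one architecture, so it trades portability for faster startup.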