Chapter 11: Testing, Debugging, and Benchmarking GPU Kernels

This chapter covers strategies for verifying the correctness, diagnosing the failures, and measuring the performance of OpenCL kernel code developed on top of nmathopencl.

Correctness testing

Because every kernel wrapper contains a CPU fallback path, the most reliable testing strategy is to compare the OpenCL output against the CPU reference output on the same inputs. Standard R unit-test frameworks (testthat, tinytest) work directly: write tests that call the wrapper function and assert numerical agreement within an appropriate tolerance (typically .Machine$double.eps^0.5, about 1.5e-8, for double-precision kernels).
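The comparison described above can be sketched in base R. This is a minimal illustration, not nmathopencl's own test code: cpu_reference() and gpu_wrapper() are hypothetical stand-ins, and the "GPU" result is simulated by perturbing the reference by a few units of rounding error so that the sketch runs without an OpenCL device.

```r
# Hypothetical stand-ins: neither function is part of nmathopencl's API.
# cpu_reference() plays the role of the CPU fallback path, and
# gpu_wrapper() plays the role of the OpenCL path. The GPU result is
# simulated by a tiny relative perturbation of the reference.
cpu_reference <- function(x) stats::dnorm(x)
gpu_wrapper <- function(x) stats::dnorm(x) * (1 + 4 * .Machine$double.eps)

set.seed(1)
x <- rnorm(1000)

# Same value as .Machine$double.eps^0.5.
tol <- sqrt(.Machine$double.eps)

# all.equal() returns TRUE when the mean relative difference between
# the two vectors does not exceed `tol`; otherwise it returns a
# character description of the mismatch, hence the isTRUE() wrapper.
agreement <- all.equal(gpu_wrapper(x), cpu_reference(x), tolerance = tol)
stopifnot(isTRUE(agreement))
```

Inside a testthat test, the same check would typically be written as expect_equal(gpu_wrapper(x), cpu_reference(x), tolerance = tol).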

Key points:

Debugging kernel failures

When a kernel fails to compile or execute, the OpenCL runtime reports an error code. nmathopencl propagates these as R errors via stop(). Common causes:
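Whatever the underlying cause, an error raised through stop() can be caught and inspected like any other R condition. The following is a minimal sketch under stated assumptions: failing_wrapper() is a hypothetical stand-in for a wrapper whose kernel fails to build, and the message text is invented for illustration (the actual wording produced by nmathopencl may differ). The code -11 is the value of CL_BUILD_PROGRAM_FAILURE in the OpenCL headers.

```r
# Hypothetical stand-in for a wrapper whose kernel fails to compile.
# The function name and message format are assumptions; only the use
# of stop() to surface the error is taken from the text.
failing_wrapper <- function(x) {
  stop("OpenCL error -11 (CL_BUILD_PROGRAM_FAILURE)")
}

result <- tryCatch(
  failing_wrapper(1:10),
  error = function(e) {
    # conditionMessage() extracts the text that was passed to stop().
    message("Kernel call failed: ", conditionMessage(e))
    NULL  # return NULL so the caller can detect the failure
  }
)

# TRUE here, because the handler returned NULL.
is.null(result)
```

Catching the condition this way keeps the original error code visible in the message, which is usually the first thing to look up when diagnosing a failure.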

Benchmarking

Use bench::mark() or microbenchmark::microbenchmark() to compare the GPU path against the CPU fallback. A few guidelines:
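As a starting point, a comparison might look like the sketch below. cpu_path() and gpu_path() are hypothetical placeholders (both simply call stats::pnorm() here so that the code runs without a device); in practice they would be the fallback and OpenCL code paths of the same wrapper. The sketch uses microbenchmark when it is installed and otherwise falls back to a coarse system.time() loop.

```r
# Hypothetical stand-ins for the two code paths being compared.
# Both call stats::pnorm() so that the sketch runs anywhere; a real
# gpu_path() would dispatch to the OpenCL kernel instead.
cpu_path <- function(x) stats::pnorm(x)
gpu_path <- function(x) stats::pnorm(x)

x <- seq(-4, 4, length.out = 1e5)

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  # Run each expression 20 times and collect the timing distribution.
  timings <- microbenchmark::microbenchmark(
    cpu = cpu_path(x),
    gpu = gpu_path(x),
    times = 20L
  )
} else {
  # Fallback without extra packages: total elapsed seconds for
  # 20 repetitions of each path.
  timings <- c(
    cpu = system.time(for (i in 1:20) cpu_path(x))[["elapsed"]],
    gpu = system.time(for (i in 1:20) gpu_path(x))[["elapsed"]]
  )
}
print(timings)
```

If bench::mark() is used instead, its default check = TRUE also verifies that the compared expressions return equal results, which combines the timing run with a basic correctness check.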