inst/cl/nmath/?The nmath/ subdirectory under inst/cl/
contains a complete OpenCL C port of R’s nmath (Mathlib)
library — approximately 180 .cl files, one per source file
in R’s src/nmath/. Together they implement the full suite
of statistical distribution functions (densities, CDFs, quantiles,
random-variate generators) and supporting mathematical helpers that make
up R’s statistical computing core.
The port is designed so that any downstream package can load a subset of this library — just the functions it actually needs — without pulling in the entire collection. The annotation scheme and dependency resolver (described in Chapter 08) make this automatic.
Every .cl file in nmath/ begins with a
standard metadata block:
// @source_type: c
// @source_origin: dnorm.c
// @includes: nmath.h, dpq.h
// @depends: nmath, dpq
// @provides: dnorm4
// @all_depends_count: 3
// @all_depends: dpq, Rmath, nmath
// @load_order: 36| Tag | Meaning |
|---|---|
@source_type |
Always c for nmath ports |
@source_origin |
The original C source file in R’s src/nmath/ |
@includes |
Headers included in the original C file |
@depends |
Direct file dependencies (.cl
filenames without extension) |
@provides |
Functions defined in this file |
@all_depends |
Full transitive closure of dependencies (union over the DAG) |
@all_depends_count |
Number of entries in @all_depends |
@load_order |
Pre-computed position in the topologically sorted order |
The @depends tag is what
opencltools::load_kernel_library() uses to sort files. The
@all_depends tag is informational — it documents the
complete dependency set so that a developer extracting a minimal subset
knows exactly which files to include.
The ~180 files are organized by function family. The table below shows representative entries for the main families.
| File | Provides |
|---|---|
nmath.cl |
Library header: constants, macros, error codes |
Rmath.cl |
Forward declarations for all Mathlib functions |
dpq.cl |
R_D__0, R_D__1, R_DT_0,
R_DT_1, R_D_Lval, … (log-prob helpers) |
refactored.cl |
Forward declarations for cycle-broken functions |
| File | Provides |
|---|---|
chebyshev.cl |
chebyshev_init, chebyshev_eval |
cospi.cl |
cospi, sinpi, tanpi |
fmax2.cl |
fmax2 |
fmin2.cl |
fmin2 |
gammalims.cl |
gammalims |
lgammacor.cl |
lgammacor |
lgamma.cl |
lgammafn, lgammafn_sign |
gamma.cl |
gammafn |
stirlerr.cl |
stirlerr |
bd0.cl |
bd0, ebd0 |
pgamma_utils.cl |
log1pmx, lgamma1p |
mlutils.cl |
log1pexp, log1mexp,
logspace_add, logspace_sub |
expm1.cl |
expm1 (if not an OpenCL built-in) |
log1p.cl |
log1p (if not an OpenCL built-in) |
d*)| File | Provides |
|---|---|
dnorm.cl |
dnorm4 |
dgamma.cl |
dgamma |
dbinom.cl |
dbinom_raw, dbinom |
dpois.cl |
dpois_raw, dpois |
dbeta.cl |
dbeta |
dt.cl |
dt |
df.cl |
df |
dnbinom.cl |
dnbinom, dnbinom_mu |
dchisq.cl |
dchisq |
dexp.cl |
dexp |
dweibull.cl |
dweibull |
dlnorm.cl |
dlnorm |
dlogis.cl |
dlogis |
dcauchy.cl |
dcauchy |
dunif.cl |
dunif |
dhyper.cl |
dhyper |
dgeom.cl |
dgeom |
dnt.cl |
dnt (noncentral t) |
dnf.cl |
dnf (noncentral F) |
dnchisq.cl |
dnchisq (noncentral chi-sq) |
dnbeta.cl |
dnbeta (noncentral beta) |
p*)Similar coverage for pnorm.cl, pgamma.cl,
pbinom.cl, ppois.cl, pbeta.cl,
pt.cl, pf.cl, etc. Each file ports the
corresponding p<dist>.c from R’s source.
q*)qnorm.cl, qgamma.cl,
qbinom.cl, qpois.cl, qbeta.cl,
qt.cl, qf.cl, etc. Discrete quantile functions
use a shared binary-search helper in
qDiscrete_search.cl.
r*)rnorm.cl, rgamma.cl,
rbinom.cl, rpois.cl, rbeta.cl,
rt.cl, runif.cl, etc. These files define the
device-side generator logic; the actual RNG seed and state management is
handled by the host-side kernel runner.
| File | Provides |
|---|---|
beta.cl |
beta (beta function) |
lbeta.cl |
lbeta |
choose.cl |
choose, lchoose |
polygamma.cl |
psigamma, digamma, trigamma,
… |
toms708.cl |
Incomplete beta (TOMS 708 algorithm) |
bessel.cl |
bessel_i, bessel_j, bessel_k,
bessel_y |
signrank.cl |
dsignrank, psignrank,
qsignrank, rsignrank |
wilcox.cl |
dwilcox, pwilcox, qwilcox,
rwilcox |
| File | Provides |
|---|---|
fprec.cl, fround.cl,
ftrunc.cl, fsign.cl |
Rounding/truncation |
imax2.cl, imin2.cl |
Integer min/max |
sign.cl |
sign |
d1mach.cl, i1mach.cl |
Machine constants (BLAS-era helpers) |
r_check_user_interrupt.cl |
Stub for interrupt checking |
sexp.cl, snorm.cl,
sunif.cl |
Alternative RNG implementations |
OpenCL compiles a program as a single translation unit: every function must be defined before it is called (no forward-declaration headers, no separate compilation). A handful of the nmath C files are mutually recursive in ways that C handles via separate compilation but OpenCL cannot.
nmathopencl resolves these cycles by splitting the
offending source file into two .cl files:
stirlerr.c -> stirlerr_cycle_free.cl (no forward references)
stirlerr_cycle_dependent.cl (calls the forward-declared part)
refactored.cl (forward declarations only)
refactored.cl contains extern-style forward
declarations so that the cycle-dependent portion can reference symbols
that appear later in the concatenated source. The @depends
tags ensure the three files are loaded in the correct order:
refactored -> stirlerr_cycle_free ->
stirlerr_cycle_dependent -> stirlerr.
Similarly, bessel_j and bessel_y are each
split into *_cycle_free.cl and
*_cycle_dependent.cl for the same reason.
ex_glmbayes_nmath/ subsetBecause the full nmath/ library is large (~180 files),
the package also ships a pre-curated minimal subset in
inst/cl/ex_glmbayes_nmath/ — exactly the 20 files needed by
the six glmbayes kernels:
Rmath.cl, nmath.cl, refactored.cl,
chebyshev.cl, cospi.cl, fmax2.cl, gammalims.cl, lgammacor.cl,
gamma.cl, lgamma.cl,
stirlerr_cycle_free.cl, pgamma_utils.cl, stirlerr_cycle_dependent.cl,
stirlerr.cl, bd0.cl,
dpois.cl, dgamma.cl, dnorm.cl, pnorm.cl, dbinom.cl
This subset is determined by the @all_depends metadata
in the six kernel files. Chapter 10 shows how to perform this extraction
for a new downstream package.
The port aims for bit-for-bit equivalence with the CPU nmath results on IEEE 754 hardware. The main sources of deviation are:
-cl-fp64-round-to-nearest
is specified. The OpenCLConfig struct in
openclPort.h probes the device and sets appropriate build
flags.sqrt, exp, and log with 1 ULP
error (IEEE 754 guarantees exactly rounded); others allow slightly more.
In practice, deviations are below numerical tolerance for all
statistical applications.lgamma, tgamma, expm1, and
log1p are part of the OpenCL 1.1+ specification and do
not need to be ported. The corresponding
.cl files include #ifndef guards so that the
ported version is only compiled if the built-in is absent.