Nonparametric Exponential Tilting Theory

Overview

This vignette explains the matrix-vectorized implementation of the fully nonparametric exponential tilting (EXPTILT) estimator, as described in Appendix 2 of Riddles et al. (2016). This method is designed for fully categorical data, where the outcomes ($Y$), the response-model covariates ($X_1$), and the instrumental-variable covariates ($X_2$) are all discrete.

Unlike the parametric method, which estimates the parameters $\phi$ of a response probability function $\pi(x, y; \phi)$, the nonparametric approach directly estimates the nonresponse odds for each stratum. This is achieved using an Expectation-Maximization (EM) algorithm to find the maximum likelihood estimates of these odds.

The implementation (exptilt_nonparam.data.frame) assumes the input data is an aggregated data frame, where each row represents a unique stratum $x^* = (x_1, x_2)$ and contains the counts of respondents for each outcome $y^*$ and the total count of nonrespondents.

Notation and Main Objects

The implementation maps directly to the notation in Appendix 2. Let $x_1$ be the covariates for the response model and $x_2$ be the nonresponse instrumental variable. A full stratum is $x^* = (x_1, x_2)$, and a response-model-only stratum is $x_1$.

The algorithm is built from a set of fixed matrices (computed once) and one matrix that is iteratively updated.

Fixed Objects (Pre-computed)

Respondent Counts $N_{y^*x^*}$ (code: n_y_x_matrix)
- Source: data[, outcome_cols]
- Dimensions: $(N_{\text{strata}} \times K_{\text{outcomes}})$, where $N_{\text{strata}}$ is the number of $(x_1, x_2)$ rows and $K_{\text{outcomes}}$ is the number of $Y$ categories.
- Definition: The observed (weighted) count of respondents for stratum $x^*$ and outcome $y^*$. \[N_{y^*x^*} = \sum_{i \in A} d_i \delta_i I(y_i = y^*, x_i = x^*)\] (Note: The implementation assumes $d_i=1$ unless the input data is pre-weighted).
Nonrespondent Counts $M_{x^*}$ (code: m_x_vec)
- Source: data[, refusal_col]
- Dimensions: $(N_{\text{strata}} \times 1)$
- Definition: The observed (weighted) total count of nonrespondents for stratum $x^*$. \[M_{x^*} = m_{x^*}\]
Respondent Proportions $\hat{P}_{y^*|x^*}$ (code: p_hat_matrix)
- Source: n_y_x_matrix / rowSums(n_y_x_matrix)
- Dimensions: $(N_{\text{strata}} \times K_{\text{outcomes}})$
- Definition: The conditional probability (proportion) of observing outcome $y^*$ given stratum $x^*$, among respondents. This is a fixed, observed quantity used in the E-Step. \[\hat{p}_{y^*|x^*} = \frac{N_{y^*x^*}}{\sum_{y} N_{yx^*}} = pr(y=y^* | x_1=x_1^*, x_2=x_2^*, \delta=1)\]
Aggregated Respondent Counts $N_{y^*x_1^*}$ (code: n_y_x1_matrix)
- Source: aggregate(n_y_x_matrix ~ data$x1_key, ...)
- Dimensions: $(N_{x_1} \times K_{\text{outcomes}})$, where $N_{x_1}$ is the number of unique $x_1$ strata.
- Definition: The observed (weighted) count of respondents for outcome $y^*$ in stratum $x_1$, summed over the instrument $x_2$. This is the denominator of the M-Step. \[N_{y^*x_1^*} = \sum_{x_2} N_{y^*(x_1^*, x_2)} = n_{y^*x_1^*}\]

Iterative Objects (The EM Algorithm)

Odds Matrix $O^{(t)}(x_1, y)$ (code: odds_matrix)
- Dimensions: $(N_{x_1} \times K_{\text{outcomes}})$
- Definition: The parameter being estimated. It represents the odds of nonresponse for a given $x_1$ stratum and outcome $y$. This is updated in the M-Step.
- Initialization (Step 0): $O^{(0)}(x_1, y) = 1$ for all $(x_1, y)$.
Expected Nonrespondent Counts $M_{y^*x^*}^{(t)}$ (code: m_y_x_matrix)
- Dimensions: $(N_{\text{strata}} \times K_{\text{outcomes}})$
- Definition: The expected count of nonrespondents for stratum $x^*$ and outcome $y^*$, given the current odds $O^{(t)}$. This is computed in the E-Step.
Aggregated Expected Nonrespondent Counts $M_{y^*x_1^*}^{(t)}$ (code: m_y_x1_matrix)
- Dimensions: $(N_{x_1} \times K_{\text{outcomes}})$
- Definition: The expected count of nonrespondents for outcome $y^*$ in stratum $x_1$, summed over the instrument $x_2$. This is the numerator of the M-Step.

The Expectation-Maximization (EM) Algorithm

The function exptilt_nonparam.data.frame is a direct implementation of the EM algorithm in Appendix 2. The goal is to find the $\text{argmax} O(x_1, y)$ that maximizes the observed data likelihood, which is solved by iterating two steps.

Step 1.1 (E-Step): Compute Expected Nonrespondent Counts

The E-Step computes the expected breakdown of nonrespondents into outcome categories. It answers: “Given our current odds_matrix $O^{(t)}$, what is the expected count $M_{y^*x^*}^{(t)}$ for each full stratum $x^*$?”

Formula: \[M_{y^*x^*}^{(t)} = M_{x^*} \frac{\hat{p}_{y^*|x^*} O^{(t)}(x_1^*, y^*)}{\sum_{y} \hat{p}_{y|x^*} O^{(t)}(x_1^*, y)}\]
Implementation (E-STEP in code): This is vectorized. The denominator is computed via rowSums(p_hat_matrix * odds_joined_matrix). The full calculation is: m_y_x_matrix <- m_x_vec * (p_hat_matrix * odds_joined_matrix) / denominator

Step 1.2 (M-Step): Update Odds Matrix

The M-Step updates the nonresponse odds $O(x_1, y)$ using the expected counts from the E-Step.

Aggregate Expected Counts: First, the expected nonrespondent counts $M_{y^*x^*}^{(t)}$ are aggregated over the instrument $x_2$ to get the total expected nonrespondents $M_{y^*x_1^*}^{(t)}$ for each $(x_1, y)$ cell.
- Formula: $M_{y^*x_1^*}^{(t)} = \sum_{x_2} M_{y^*(x_1^*, x_2)}^{(t)}$
- Implementation: m_y_x1_matrix <- aggregate(m_y_x_matrix ~ data$x1_key, ...)
Update Odds: The new odds $O^{(t+1)}$ is the simple ratio of the total expected nonrespondents to the total observed respondents for each $(x_1, y)$ cell.
- Formula: \[O^{(t+1)}(x_1^*, y^*) = \frac{M_{y^*x_1^*}^{(t)}}{N_{y^*x_1^*}}\]
- Implementation: odds_matrix <- m_y_x1_matrix / n_safe

Step 2: Convergence

Steps 1.1 and 1.2 are repeated until the change in the odds_matrix (measured by the sum of absolute differences) falls below the tolerance tol.

Final Estimates and Survey Weights

Survey Weights

The exptilt_nonparam.data.frame function assumes the input data is already aggregated. If a survey.design object is provided to exptilt_nonparam.survey.design, an adapter (not shown here) must first be used to create this aggregated table from the microdata. In that case, $N_{y^*x^*}$ and $M_{x^*}$ represent weighted sums of counts ($\sum d_i \dots$), not simple counts. The EM algorithm and all derived matrices (p_hat_matrix, odds_matrix) are then correctly computed based on these weighted inputs, following the logic of the paper.

Final Adjusted Counts

The final output data_to_return is the object of primary interest for analysis. It is constructed by: 1. Calculating the final expected nonrespondent counts $M_{y^*x^*}^{(\text{final})}$ using the converged odds_matrix. 2. Adding these expected counts to the original observed respondent counts $N_{y^*x^*}$.

Definition: data_to_return[y^*, x^*] = N_{y^*x^*} + M_{y^*x^*}^{(\text{final})}
Implementation: data_to_return[, outcome_cols] <- n_y_x_matrix_ordered + m_y_x_matrix_ordered

This final adjusted table represents the completed dataset, where the “Refusal” counts have been redistributed across the outcome columns according to the NMAR model.

Final Proportion $\hat{\theta}_j$

The final population proportion for an outcome $j$, $\hat{\theta}_j$, is the total (weighted) adjusted count for outcome $j$ divided by the total (weighted) population. This is calculated from the data_to_return object.

Formula (Unweighted): \[\hat{\theta}_j = \frac{\sum_{x^*} (N_{jx^*} + M_{jx^*}^{(\text{final})})}{\sum_y \sum_{x^*} (N_{yx^*} + M_{yx^*}^{(\text{final})})}\]
Implementation: This is precisely what the “Adjusted Proportions” calculation in your notebook’s Chunk 5 performs on the data_to_return object.

This differs from the paper’s Step 3 formula, which is an IPW (Inverse Probability Weighting) estimator. However, both methods are asymptotically equivalent, and the “add-and-sum” method (your data_to_return) is a more direct and intuitive application of the EM algorithm’s goal.