| Type: | Package |
| Title: | Estimation under not Missing at Random Nonresponse |
| Version: | 0.1.2 |
| Description: | Methods to estimate finite-population parameters under nonresponse that is not missing at random (NMAR, nonignorable). Incorporates auxiliary information and user-specified response models, and supports independent samples and complex survey designs via objects from the 'survey' package. Provides diagnostics and optional variance estimates. For methodological background see Qin, Leung and Shao (2002) <doi:10.1198/016214502753479338> and Riddles, Kim and Im (2016) <doi:10.1093/jssam/smv047>. |
| License: | MIT + file LICENSE |
| URL: | https://github.com/ncn-foreigners/NMAR, https://ncn-foreigners.ue.poznan.pl/NMAR/index.html |
| BugReports: | https://github.com/ncn-foreigners/NMAR/issues |
| Encoding: | UTF-8 |
| Imports: | stats, nleqslv, utils, generics, Formula |
| RoxygenNote: | 7.3.3 |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), numDeriv, survey, svrep, broom, progressr, future, future.apply, spelling |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 3.5) |
| LazyData: | true |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-02-04 15:05:13 UTC; runner |
| Author: | Maciej Beresewicz |
| Maintainer: | Maciej Beresewicz <maciej.beresewicz@ue.poznan.pl> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-05 07:40:23 UTC |
Apply scaling to a matrix using a recipe
Description
Apply scaling to a matrix using a recipe
Usage
apply_nmar_scaling(matrix_to_scale, recipe)
Arguments
matrix_to_scale |
A numeric matrix with column names present in |
recipe |
An object of class |
Value
A numeric matrix with each column centered and scaled using recipe.
Bootstrap variance estimation module
Description
Estimates the variance of a scalar estimator via bootstrap resampling for IID data or bootstrap replicate weights for survey designs.
Usage
bootstrap_variance(data, estimator_func, point_estimate, ...)
Arguments
data |
A |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Numeric scalar; used for survey bootstrap variance
(passed to |
... |
Additional arguments. Some are consumed by |
Details
For data.frame inputs, performs IID bootstrap by resampling rows and rerunning estimator_func on each resample, then computing the empirical variance of the replicate estimates.
For survey.design inputs, converts the design to a bootstrap replicate-weight design with svrep::as_bootstrap_design(), evaluates estimator_func on each replicate weight vector by injecting the replicate analysis weights into a copy of the input design, and passes the resulting replicate estimates and replicate scaling factors to survey::svrVar().
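The IID path described above can be illustrated with a minimal base R sketch. The helper name boot_var_sketch and its arguments are hypothetical and are not part of the package; the sketch only mirrors the resample, re-estimate, and take-the-variance steps.

```r
# Sketch of an IID bootstrap for a scalar estimator (base R only).
# `boot_var_sketch` is a hypothetical helper, not package API.
boot_var_sketch <- function(data, estimator, reps = 200) {
  n <- nrow(data)
  replicates <- vapply(seq_len(reps), function(b) {
    idx <- sample.int(n, n, replace = TRUE)  # resample rows with replacement
    estimator(data[idx, , drop = FALSE])     # rerun the estimator on the resample
  }, numeric(1))
  v <- stats::var(replicates)                # empirical variance of the replicates
  list(se = sqrt(v), variance = v, replicates = replicates)
}

set.seed(42)
toy <- data.frame(y = rnorm(100, mean = 2))
out <- boot_var_sketch(toy, function(d) mean(d$y), reps = 200)
```

The returned list uses the same component names (se, variance, replicates) that the methods below document, purely for readability.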
Bootstrap-specific options
resample_guard: IID bootstrap only. A function function(indices, data) that returns TRUE to accept a resample and FALSE to reject it.
bootstrap_settings: A list of arguments forwarded to svrep::as_bootstrap_design().
bootstrap_options: Alias for bootstrap_settings.
bootstrap_type: The type argument for svrep::as_bootstrap_design().
bootstrap_mse: The mse argument for svrep::as_bootstrap_design().
Progress Reporting
If the optional progressr package is installed, bootstrap calls
indicate progress via a progressr::progressor inside
progressr::with_progress(). Users control if and how progress is shown
by registering handlers with progressr::handlers(). When
progressr is not installed or no handlers are active, bootstrap runs
silently.
Parallelization
By default, bootstrap replicate evaluation runs sequentially via
base::lapply() for both IID resampling and survey replicate-weight bootstrap.
If the optional future.apply package is installed, bootstrap can use
future.apply::future_lapply(future.seed = TRUE) when the user has set
a parallel future::plan().
The backend is controlled by the package option nmar.bootstrap_apply:
"auto" (default): Use base::lapply() unless the current future plan has more than one worker, in which case use future.apply::future_lapply() if available.
"base": Always use base::lapply(), even if future.apply is installed.
"future": Always use future.apply::future_lapply().
When future.apply is used, random-number streams are parallel-safe and
backend-independent under the future framework. When base::lapply()
is used, results are reproducible under set.seed() but will
likely not match the future.seed streams.
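As a small illustration, the backend option named above can be set and restored with base R's options(). Only the option name nmar.bootstrap_apply and its value "base" come from this page; the surrounding pattern is a generic sketch.

```r
# Force the sequential backend for a reproducible run, then restore the old value.
old <- options(nmar.bootstrap_apply = "base")
backend <- getOption("nmar.bootstrap_apply")

set.seed(123)
# ... run the bootstrap here; with base::lapply() results follow set.seed() ...

options(old)  # restore whatever was set before
```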
Bootstrap for IID data frames
Description
Bootstrap for IID data frames
Usage
## S3 method for class 'data.frame'
bootstrap_variance(
data,
estimator_func,
point_estimate,
bootstrap_reps = 500,
...
)
Arguments
data |
A |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Unused for IID bootstrap, included for signature consistency. |
bootstrap_reps |
integer; number of resamples. |
... |
Additional arguments. Some are consumed by |
Value
A list with components se, variance, and replicates.
Default dispatch
Description
Default dispatch
Usage
## Default S3 method:
bootstrap_variance(data, estimator_func, point_estimate, ...)
Bootstrap for survey designs
Description
Bootstrap for survey designs
Usage
## S3 method for class 'survey.design'
bootstrap_variance(
data,
estimator_func,
point_estimate,
bootstrap_reps = 500,
survey_na_policy = c("strict", "omit"),
...
)
Arguments
data |
A |
estimator_func |
Function returning an object with a numeric scalar
component |
point_estimate |
Numeric scalar; used for survey bootstrap variance
(passed to |
bootstrap_reps |
integer; number of bootstrap replicates. |
survey_na_policy |
Character string specifying how to handle replicates that fail to produce estimates. Options:
|
... |
Additional arguments. Some are consumed by |
Details
This path constructs a replicate-weight design using
svrep::as_bootstrap_design() and evaluates the estimator on each set of
bootstrap replicate analysis weights.
Replicate evaluation starts from a shallow template copy of the input survey
design (including its ids/strata/fpc structure) and injects each replicate's
analysis weights by updating the design's probability slots (prob/allprob) so that
weights(design) returns the desired replicate weights.
This avoids replaying or reconstructing a svydesign() call and therefore
supports designs created via subset() and update().
NA policy: By default, survey bootstrap uses a strict NA policy:
if any replicate fails to produce a finite estimate, the entire bootstrap
fails with an error. Setting survey_na_policy = "omit" drops failed
replicates and proceeds with the remaining replicates.
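The difference between the two policies can be sketched in a few lines of base R. The helper apply_na_policy_sketch is hypothetical and only illustrates the accept/reject logic; it does not reproduce how the package rescales the remaining replicates.

```r
# Sketch of the "strict" and "omit" policies for non-finite replicate estimates.
# `apply_na_policy_sketch` is a hypothetical helper, not package API.
apply_na_policy_sketch <- function(replicates, policy = c("strict", "omit")) {
  policy <- match.arg(policy)
  bad <- !is.finite(replicates)
  if (any(bad) && policy == "strict") {
    stop(sum(bad), " replicate(s) failed to produce a finite estimate")
  }
  replicates[!bad]  # under "omit", keep only the finite replicates
}

reps <- c(1.2, NA, 0.9, 1.1)
kept <- apply_na_policy_sketch(reps, "omit")
strict_result <- tryCatch(
  apply_na_policy_sketch(reps, "strict"),
  error = function(e) "error"
)
```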
Value
A list with components se, variance, and replicates.
Limitations
Calibrated/post-stratified designs: Post-hoc adjustments applied
via survey::calibrate(), survey::postStratify(), or
survey::rake() are not supported here and will cause the function to
error. These adjustments are not recomputed when replicate weights are
injected, so the replicate designs would not reflect the intended
calibrated/post-stratified analysis.
Replicate-weight designs not supported
Description
Replicate-weight designs not supported
Usage
## S3 method for class 'svyrep.design'
bootstrap_variance(data, estimator_func, point_estimate, ...)
Default coefficients for NMAR results
Description
Returns missingness-model coefficients.
Usage
## S3 method for class 'nmar_result'
coef(object, ...)
Arguments
object |
An 'nmar_result' object. |
... |
Ignored. |
Value
A named numeric vector or 'NULL'.
Coefficient table for summary objects
Description
Returns a coefficients table (Estimate, Std. Error, statistic, p-value) from a 'summary_nmar_result*' object when missingness-model coefficients and a variance matrix are available. If the summary does not carry missingness-model coefficients, returns 'NULL'.
Usage
## S3 method for class 'summary_nmar_result'
coef(object, ...)
Arguments
object |
An object of class 'summary_nmar_result' (or subclass). |
... |
Ignored. |
Details
The statistic column is labelled "t value" when finite degrees of freedom are present in survey designs, otherwise it is labelled "z value".
Value
A data.frame with rows named by coefficient, or 'NULL' if not available.
Compute mean and standard deviation
Description
Compute mean and standard deviation
Usage
compute_weighted_stats(values, weights = NULL)
Wald confidence interval for NMAR results
Description
Wald confidence interval for NMAR results
Usage
## S3 method for class 'nmar_result'
confint(object, parm, level = 0.95, ...)
Arguments
object |
An object of class 'nmar_result'. |
parm |
Ignored. |
level |
Confidence level. |
... |
Ignored. |
Value
A 1x2 numeric matrix with confidence limits.
Confidence intervals for summary objects
Description
Returns Wald-style confidence intervals for missingness-model coefficients from a 'summary_nmar_result*' object. Uses t-quantiles when finite degrees of freedom are available, otherwise normal quantiles.
Usage
## S3 method for class 'summary_nmar_result'
confint(object, parm, level = 0.95, ...)
Arguments
object |
An object of class 'summary_nmar_result' (or subclass). |
parm |
A specification of which coefficients are to be given confidence intervals, either a vector of names or a vector of indices. By default, all coefficients are considered. |
level |
The confidence level required. |
... |
Ignored. |
Value
A numeric matrix with columns giving lower and upper confidence limits for each parameter. Row names correspond to coefficient names. Returns 'NULL' if coefficients are unavailable.
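The choice between t and normal quantiles can be sketched as follows. The function wald_ci_sketch, its df argument, and the column labels are assumptions made for illustration; the actual method may label columns differently.

```r
# Sketch of Wald-style intervals: t-quantiles for finite df, normal otherwise.
# `wald_ci_sketch` is a hypothetical helper, not package API.
wald_ci_sketch <- function(est, se, level = 0.95, df = Inf) {
  alpha <- 1 - level
  q <- if (is.finite(df)) {
    stats::qt(1 - alpha / 2, df)
  } else {
    stats::qnorm(1 - alpha / 2)
  }
  ci <- cbind(lower = est - q * se, upper = est + q * se)
  rownames(ci) <- names(est)
  ci
}

est <- c("(Intercept)" = -0.7, Y = 0.4)
se <- c(0.2, 0.1)
ci_z <- wald_ci_sketch(est, se)           # normal quantiles
ci_t <- wald_ci_sketch(est, se, df = 10)  # wider, t-based limits
```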
Constraint summaries for diagnostics
Description
Reports the constraint sums used in the estimating equations.
Usage
constraint_summaries(w_i_hat, W_hat, mass_untrim, X_centered)
Build a scaling recipe from one or more design matrices
Description
Build a scaling recipe from one or more design matrices
Usage
create_nmar_scaling_recipe(
...,
intercept_col = "(Intercept)",
weights = NULL,
weight_mask = NULL,
tol_constant = 1e-08,
warn_on_constant = TRUE
)
Arguments
... |
One or more numeric matrices with column names. |
intercept_col |
Name of an intercept column that should remain unscaled. |
weights |
Optional nonnegative numeric vector used to compute weighted means and standard deviations. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to |
tol_constant |
Numeric tolerance below which columns are treated as constant and left unscaled. |
warn_on_constant |
Logical; warn when a column is treated as constant. |
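The idea of a scaling recipe can be sketched in base R. The recipe layout used here (a named list holding a mean and sd per column) and both helper names are assumptions for illustration; the structure of the package's own recipe object is not described on this page. Weights and warnings are left out.

```r
# Sketch: build per-column centering/scaling constants, then apply them.
# Both helpers and the recipe layout are hypothetical, not package API.
make_recipe_sketch <- function(X, intercept_col = "(Intercept)",
                               tol_constant = 1e-8) {
  recipe <- lapply(colnames(X), function(nm) {
    s <- stats::sd(X[, nm])
    if (nm == intercept_col || s < tol_constant) {
      list(mean = 0, sd = 1)  # intercept and constant columns stay unscaled
    } else {
      list(mean = mean(X[, nm]), sd = s)
    }
  })
  names(recipe) <- colnames(X)
  recipe
}

apply_recipe_sketch <- function(X, recipe) {
  for (nm in colnames(X)) {
    X[, nm] <- (X[, nm] - recipe[[nm]]$mean) / recipe[[nm]]$sd
  }
  X
}

X <- cbind("(Intercept)" = 1, x1 = c(1, 2, 3, 4), x2 = c(5, 5, 5, 5))
Xs <- apply_recipe_sketch(X, make_recipe_sketch(X))
```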
Create Verbose Printer Factory
Description
Creates a verbose printing function based on trace level settings. Messages are printed only if their level is <= trace_level.
Usage
create_verboser(trace_level = 0)
Arguments
trace_level |
Integer 0-3; controls verbosity detail: - 0: No output (silent mode) - 1: Major steps only (initialization, convergence) - 2: Moderate detail (iteration summaries, key diagnostics) - 3: Full detail (all diagnostics, intermediate values) |
Value
A function with signature: 'verboser(msg, level = 1, type = c("info", "step", "detail", "result"))'
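A closure-based factory of this kind can be sketched as below. The name make_verboser_sketch and the message format are hypothetical; only the rule "print when level <= trace_level" and the signature are taken from the text above.

```r
# Sketch of a verbose-printer factory: the returned closure remembers trace_level.
# `make_verboser_sketch` is a hypothetical helper, not package API.
make_verboser_sketch <- function(trace_level = 0) {
  function(msg, level = 1, type = c("info", "step", "detail", "result")) {
    type <- match.arg(type)
    if (level > trace_level) return(invisible(FALSE))  # too detailed: stay silent
    message(sprintf("[%s] %s", type, msg))
    invisible(TRUE)
  }
}

quiet <- make_verboser_sketch(0)
chatty <- make_verboser_sketch(2)
printed_quiet <- quiet("Solver converged", level = 1)
printed_chatty <- chatty("Solver converged", level = 1, type = "step")
printed_detail <- chatty("Full Jacobian dump", level = 3, type = "detail")
```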
Empirical likelihood estimator
Description
Empirical likelihood estimator
Usage
el(data, ...)
Arguments
data |
A |
... |
Passed to class-specific methods. |
Empirical likelihood for data frames
Description
Empirical likelihood for data frames
Usage
## S3 method for class 'data.frame'
el(
data,
formula,
auxiliary_means = NULL,
standardize = TRUE,
trim_cap = Inf,
control = list(),
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
n_total = NULL,
start = NULL,
trace_level = 0,
family = logit_family(),
...
)
Empirical likelihood estimator for survey designs
Description
Empirical likelihood estimator for survey designs
Usage
## S3 method for class 'survey.design'
el(
data,
formula,
auxiliary_means = NULL,
standardize = TRUE,
strata_augmentation = TRUE,
trim_cap = Inf,
control = list(),
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
n_total = NULL,
start = NULL,
trace_level = 0,
family = logit_family(),
...
)
Assert that terms object lacks offsets
Description
Assert that terms object lacks offsets
Usage
el_assert_no_offset(terms_obj, label)
Strata augmentation for survey designs
Description
Augments the auxiliary design with strata dummies and appends stratum-share
means when user-supplied auxiliary_means are present.
Usage
el_augment_strata_aux(
aux_design_full,
strata_factor,
weights_full,
N_pop,
auxiliary_means
)
Empirical likelihood estimating equations for SRS
Description
Returns a function that evaluates the stacked EL system for
\theta = (\beta, z, \lambda_x) with z = \operatorname{logit}(W).
Blocks correspond to:
- missingness model score equations in \beta,
- the response-rate equation in W,
- auxiliary moment constraints in \lambda_x.
Usage
el_build_equation_system(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
When no auxiliaries are present the last block is omitted. The system
matches equations 7-10 of Qin, Leung and Shao (2002), abbreviated QLS. We cap \eta, clip w_i in ratios,
and guard D_i away from zero to ensure numerical stability.
Guarding policy:
- Cap \eta: eta <- pmax(pmin(eta, get_eta_cap()), -get_eta_cap()).
- Compute w <- family$linkinv(eta) and clip to [1e-12, 1 - 1e-12] when used in ratios.
- Denominator floor: Di <- pmax(Di_raw, nmar_get_el_denom_floor()). In the Jacobian, terms that depend on d(1/Di)/d(.) are multiplied by active = 1(Di_raw > floor) to match the clamped equations.
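The three guards can be traced on a toy vector. The numeric cap and floor below are assumed values (the values returned by get_eta_cap() and nmar_get_el_denom_floor() are not stated here), and the denominator form D_i = 1 + lambda_W (w_i - W) without auxiliaries is an assumption based on the QLS setup.

```r
# Assumed constants: stand-ins for get_eta_cap() and nmar_get_el_denom_floor().
eta_cap <- 50
denom_floor <- 1e-8

eta_raw <- c(-120, -1, 0, 2, 300)
eta <- pmax(pmin(eta_raw, eta_cap), -eta_cap)  # cap the linear predictor
w <- stats::plogis(eta)                        # logit family: linkinv(eta)
w_clipped <- pmin(pmax(w, 1e-12), 1 - 1e-12)   # clip for use in ratios

# Assumed denominator (no auxiliaries): D_i = 1 + lambda_W * (w_i - W)
W <- 0.6
lambda_W <- 3
Di_raw <- 1 + lambda_W * (w_clipped - W)
Di <- pmax(Di_raw, denom_floor)                # floor the denominator
active <- as.numeric(Di_raw > denom_floor)     # mask for Jacobian terms in d(1/Di)
```

In this toy case the first observation has a negative raw denominator, so it is floored and its active flag is 0.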
Empirical likelihood equations for survey designs
Description
Returns a function that evaluates the stacked EL system for survey designs
using design weights. Unknowns are
\theta = (\beta, z, \lambda_W, \lambda_x) with z = \operatorname{logit}(W).
Blocks correspond to:
- response-model score equations in \beta,
- the response-rate equation in W based on \sum d_i (w_i - W)/D_i = 0,
- auxiliary moment constraints \sum d_i (X_i - \mu_x)/D_i = 0,
- and the design-based linkage between \lambda_W and the nonrespondent total: T_0/(1-W) - \lambda_W \sum d_i / D_i = 0, where T_0 = N_{\mathrm{pop}} - \sum d_i on the analysis scale.
Usage
el_build_equation_system_survey(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
When all design weights are equal and N_{\mathrm{pop}} and the respondent
count match the simple random sampling setup, this system reduces to the QLS equations 6-10.
Empirical likelihood analytical Jacobian for SRS
Description
Builds the block Jacobian A = \partial F/\partial \theta for the
EL system with \theta = (\beta, z, \lambda_x) and z = \operatorname{logit}(W).
Blocks follow QLS equations 7-10.
Usage
el_build_jacobian(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Empirical likelihood analytical jacobian for survey designs
Description
Empirical likelihood analytical jacobian for survey designs
Usage
el_build_jacobian_survey(
family,
missingness_model_matrix,
auxiliary_matrix,
respondent_weights,
N_pop,
n_resp_weighted,
mu_x_scaled
)
Details
Builds the block Jacobian A = \partial g/\partial \theta for the
survey EL system with \theta = (\beta, z, \lambda_W, \lambda_x)
and z = \operatorname{logit}(W).
Build EL result object
Description
Build EL result object
Usage
el_build_result(
core_results,
inputs,
call,
formula,
engine_name = "empirical_likelihood"
)
Build starting values
Description
Build starting values
Usage
el_build_start(
missingness_model_matrix_scaled,
auxiliary_matrix_scaled,
nmar_scaling_recipe,
start,
N_pop,
respondent_weights
)
Check auxiliary means for consistency with the respondent sample
Description
Computes a simple z-score diagnostic comparing user-supplied auxiliary means to the respondents' sample means.
Usage
el_check_auxiliary_inconsistency_matrix(
auxiliary_matrix_resp,
provided_means = NULL
)
Arguments
auxiliary_matrix_resp |
Respondent-side auxiliary design matrix. |
provided_means |
Optional named numeric vector of auxiliary means aligned to the matrix columns. |
Value
list(max_z = numeric(1) or NA, cols = character())
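One plausible form of such a diagnostic is sketched below. The exact statistic used by the package is not given here, so the formula (absolute difference divided by the respondent standard error of the mean), the threshold argument, and the meaning of cols as "flagged columns" are all assumptions; aux_z_sketch is a hypothetical name.

```r
# Sketch: z-scores comparing supplied means with respondent sample means.
# `aux_z_sketch`, its formula, and `threshold` are assumptions, not package API.
aux_z_sketch <- function(aux_resp, provided_means = NULL, threshold = 3) {
  cols <- intersect(colnames(aux_resp), names(provided_means))
  if (length(cols) == 0) return(list(max_z = NA_real_, cols = character()))
  n <- nrow(aux_resp)
  m <- colMeans(aux_resp[, cols, drop = FALSE])
  s <- apply(aux_resp[, cols, drop = FALSE], 2, stats::sd)
  z <- abs(provided_means[cols] - m) / (s / sqrt(n))
  list(max_z = max(z), cols = cols[z > threshold])
}

set.seed(7)
X <- cbind(age = rnorm(400, 40, 10), income = rnorm(400, 3, 1))
res <- aux_z_sketch(X, c(age = 40, income = 5))  # income mean is far off
none <- aux_z_sketch(X, NULL)
```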
Compute diagnostics
Description
Compute diagnostics
Usage
el_compute_diagnostics(
estimates,
equation_system_func,
analytical_jac_func,
post,
respondent_weights,
auxiliary_matrix_scaled,
K_beta,
K_aux,
X_centered
)
Variance driver
Description
Variance driver
Usage
el_compute_variance(
y_hat,
full_data,
formula,
N_pop,
variance_method,
bootstrap_reps,
standardize,
trim_cap,
on_failure,
auxiliary_means,
control,
start,
family
)
Core computations
Description
Computes the capped linear predictor, response probabilities, derivatives, and stable scores with respect to the linear predictor for a given family. Centralizes the numerically delicate pieces (capping, clipping, score derivatives) to be reused in EL equations and jacobian.
Usage
el_core_eta_state(family, eta_raw, eta_cap)
Arguments
family |
Response family. |
eta_raw |
Numeric vector of unconstrained linear predictors. |
eta_cap |
Scalar cap applied symmetrically to |
Value
A list with components:
eta: Capped linear predictor.
w: Mean function family$linkinv(eta).
w_clipped: w clipped to [1e-12, 1-1e-12] for use in ratios.
mu_eta: Derivative family$mu.eta(eta).
d2mu: Second derivative family$d2mu.deta2(eta) when available, otherwise NULL.
s_eta: Score with respect to eta.
ds_eta_deta: Derivative of s_eta with respect to eta when d2mu is available, otherwise NULL.
Compute denominator
Description
Compute denominator
Usage
el_denominator(lambda_W, W, Xc_lambda, p_i, floor)
Arguments
lambda_W |
numeric scalar |
W |
numeric scalar in (0,1) |
Xc_lambda |
numeric vector (X_centered %*% lambda_x) or 0 |
p_i |
numeric vector of response probabilities |
floor |
numeric scalar > 0, denominator floor |
Value
list with denom, active, inv, inv_sq
Empirical likelihood engine for NMAR
Description
Constructs an engine specification for the empirical likelihood estimator of a full-data mean under nonignorable nonresponse.
Usage
el_engine(
standardize = TRUE,
trim_cap = Inf,
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 500,
auxiliary_means = NULL,
control = list(),
strata_augmentation = TRUE,
n_total = NULL,
start = NULL,
family = c("logit", "probit")
)
Arguments
standardize |
logical; standardize predictors. Default |
trim_cap |
numeric; cap for EL weights ( |
on_failure |
character; |
variance_method |
character; one of |
bootstrap_reps |
integer; number of bootstrap replicates when
|
auxiliary_means |
named numeric vector; population means for auxiliary
design columns. Names must match the materialized model.matrix column names
on the first RHS after formula expansion, e.g., factor indicator columns
created by |
control |
Optional solver configuration forwarded to
|
strata_augmentation |
logical; when |
n_total |
numeric; optional when supplying respondents-only data (no |
start |
list; optional starting point for the solver. Fields:
|
family |
Missingness model family. Either |
Details
The implementation follows Qin, Leung, and Shao (2002): the response
mechanism is modeled as w(y, x; \beta) = P(R = 1 \mid Y = y, X = x) and
the joint distribution of (Y, X) is represented nonparametrically by respondent
masses that satisfy empirical likelihood constraints. The mean is estimated
as a respondent weighted mean with weights proportional to
\tilde w_i = a_i / D_i(\beta, W, \lambda), where a_i are base
weights (a_i \equiv 1 for IID data and a_i = d_i for survey
designs) and D_i is the EL denominator.
For data.frame inputs the estimator solves the Qin-Leung-Shao
estimating equations for (\beta, W, \lambda_x) with W reparameterized
as z = \operatorname{logit}(W), and profiles out the response multiplier
\lambda_W using the closed-form QLS identity (their Eq. 10). For
survey.design inputs the estimator uses a design-weighted analogue
(Chen and Sitter 1999, Wu 2005) with an explicit \lambda_W and an
additional linkage equation involving the nonrespondent design-weight total
T_0.
Numerical stability:
- W is optimized on the logit scale so 0 < W < 1.
- The response-model linear predictor is capped and EL denominators D_i are floored at a small positive value. The analytic Jacobian is consistent with this guard via an active-set mask.
- Optional trimming (trim_cap) is applied only after solving, to the unnormalized masses \tilde w_i = a_i/D_i. This changes the returned weights and therefore the point estimate.
Formula syntax and data constraints: nmar() accepts a
partitioned right-hand side y_miss ~ auxiliaries | response_only. Variables
left of | enter auxiliary moment constraints. Variables right of |
enter only the response model. The outcome (LHS) is always included as a
response-model predictor through the evaluated LHS expression. Explicit use of
the outcome on the RHS is rejected. The response model always includes an
intercept, while the auxiliary block never includes an intercept.
To include a covariate in both the auxiliary constraints and the response
model, repeat it on both sides, e.g. y_miss ~ X | X.
Auxiliary means: If auxiliary_means = NULL (default) and the
outcome contains at least one NA, auxiliary means are estimated from the
full input and used as \bar X in the QLS constraints. For respondents-only
data (no NA in the outcome), n_total must be supplied, and if the
auxiliary RHS is non-empty, auxiliary_means must also be supplied.
When standardize = TRUE, supply auxiliary_means on the original
data scale; the engine applies the same standardization internally.
Survey scale: For survey.design inputs, n_total must
be on the same analysis scale as weights(design). The
default is sum(weights(design)).
Convergence and identification: the stacked EL system can have
multiple solutions. Adding response-only predictors (variables to the right
of |) can make the problem sensitive to starting values. Inspect
diagnostics such as jacobian_condition_number and consider supplying
start = list(beta = ..., W = ...) when needed.
Variance: The EL engine supports bootstrap standard errors via
variance_method = "bootstrap" or can skip variance with
variance_method = "none".
Bootstrap uses no additional packages for IID resampling, and will run
sequentially by default. If the suggested future.apply package is
installed, IID bootstrap can use future.apply::future_lapply() according to
the user's future::plan() for parallel execution.
Bootstrap backend is controlled by the package option nmar.bootstrap_apply:
"auto" (default): Use base::lapply() unless the current future plan has more than one worker, in which case use future.apply::future_lapply() if available.
"base": Always use base::lapply(), even if future.apply is installed.
"future": Always use future.apply::future_lapply().
For survey.design inputs, replicate-weight bootstrap requires the
suggested packages survey and svrep.
Value
A list of class "nmar_engine_el" containing configuration fields
to be supplied to nmar() together with a formula and data.
References
Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200. doi:10.1198/016214502753479338
Chen, J., and Sitter, R. R. (1999). A pseudo empirical likelihood approach for the effective use of auxiliary information in complex surveys. Statistica Sinica, 9, 385-406.
Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239-243.
See Also
nmar, weights.nmar_result, summary.nmar_result
Examples
set.seed(1)
n <- 200
X <- rnorm(n)
Y <- 2 + 0.5 * X + rnorm(n)
p <- plogis(-0.7 + 0.4 * scale(Y)[, 1])
R <- runif(n) < p
if (all(R)) R[1] <- FALSE
df <- data.frame(Y_miss = Y, X = X)
df$Y_miss[!R] <- NA_real_
# Estimate auxiliary mean from full data
eng <- el_engine(auxiliary_means = NULL, variance_method = "none")
# Put X in both the auxiliary block and the response model
fit <- nmar(Y_miss ~ X | X, data = df, engine = eng)
summary(fit)
# Response-only predictors can be placed to the right of |
Z <- rnorm(n)
df2 <- data.frame(Y_miss = Y, X = X, Z = Z)
df2$Y_miss[!R] <- NA_real_
eng2 <- el_engine(auxiliary_means = NULL, variance_method = "none")
fit2 <- nmar(Y_miss ~ X | Z, data = df2, engine = eng2)
print(fit2)
# Survey design usage
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~1, weights = ~1, data = df)
  eng3 <- el_engine(auxiliary_means = NULL, variance_method = "none")
  fit3 <- nmar(Y_miss ~ X, data = des, engine = eng3)
  summary(fit3)
}
Core of the empirical likelihood estimator
Description
Core of the empirical likelihood estimator
Usage
el_estimator_core(
missingness_design,
aux_matrix,
aux_means,
respondent_weights,
analysis_data,
outcome_expr,
N_pop,
formula,
standardize,
trim_cap,
control,
on_failure,
family = logit_family(),
variance_method,
bootstrap_reps,
start = NULL,
trace_level = 0,
auxiliary_means = NULL
)
Arguments
missingness_design |
Respondent-side missingness model design matrix (intercept + predictors). |
aux_matrix |
Auxiliary design matrix on respondents (may have zero columns). |
aux_means |
Named numeric vector of auxiliary population means (aligned to columns of |
respondent_weights |
Numeric vector of respondent weights aligned with |
analysis_data |
Data object used for logging and variance. |
outcome_expr |
Character string identifying the outcome expression displayed in outputs. |
N_pop |
Population size on the analysis scale. |
formula |
Original model formula used for estimation. |
standardize |
Logical. Whether to standardize predictors during estimation. |
trim_cap |
Numeric. Upper bound for empirical likelihood weight trimming. |
control |
List of control parameters for the nonlinear equation solver. |
on_failure |
Character. Action when solver fails. |
family |
List. Link function specification. |
variance_method |
Character. Variance estimation method. |
bootstrap_reps |
Integer. Number of bootstrap replications. |
auxiliary_means |
Named numeric vector of known population means supplied by the user. |
Value
List containing estimation results, diagnostics, and metadata.
Extract strata factor
Description
Looks for strata already materialized in the survey.design object.
When unavailable, attempts to reconstruct strata from the original
svydesign() call. When multiple stratification variables are supplied,
their interaction is used.
Usage
el_extract_strata_factor(design)
Compute lambda_W
Description
Compute lambda_W
Usage
el_lambda_W(C_const, W)
Arguments
C_const |
numeric scalar: (N_pop / sum(d_resp) - 1) |
W |
numeric scalar in (0,1) |
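Given the two arguments, a natural closed form is lambda_W = C_const / (1 - W). This is an assumption, not a statement of the package's code. It agrees with the survey linkage equation T_0/(1-W) - lambda_W sum(d_i / D_i) = 0 under the further assumption that sum(d_i / D_i) equals sum(d_resp). The helper name is hypothetical.

```r
# Sketch of the profiled multiplier: lambda_W = C_const / (1 - W) (assumed form).
# `lambda_W_sketch` is a hypothetical helper, not package API.
lambda_W_sketch <- function(C_const, W) {
  stopifnot(length(W) == 1, W > 0, W < 1)
  C_const / (1 - W)
}

N_pop <- 1000
n_resp <- 600                       # sum of respondent base weights
C_const <- N_pop / n_resp - 1
lam <- lambda_W_sketch(C_const, W = 0.6)

# Same value from the linkage form T_0 / ((1 - W) * sum(d_resp))
lam_linkage <- (N_pop - n_resp) / ((1 - 0.6) * n_resp)
```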
Log a step banner line
Description
Log a step banner line
Usage
el_log_banner(verboser, title)
Log data prep summary
Description
Log data prep summary
Usage
el_log_data_prep(
verboser,
outcome_var,
family_name,
K_beta,
K_aux,
auxiliary_names,
standardize,
is_survey,
N_pop,
n_resp_weighted
)
Log detailed diagnostics
Description
Log detailed diagnostics
Usage
el_log_detailed_diagnostics(
verboser,
beta_hat_unscaled,
W_hat,
lambda_W_hat,
lambda_hat,
denominator_hat
)
Log final summary
Description
Log final summary
Usage
el_log_final(verboser, y_hat, se)
Log solver configuration
Description
Log solver configuration
Usage
el_log_solver_config(verboser, control_top, final_control)
Log solver termination status
Description
Log solver termination status
Usage
el_log_solver_result(verboser, converged_success, solution, elapsed)
Log a short solver progress note
Description
Log a short solver progress note
Usage
el_log_solving(verboser)
Log starting values
Description
Log starting values
Usage
el_log_start_values(verboser, init_beta, init_z, init_lambda)
Log a short trace message with the chosen level
Description
Log a short trace message with the chosen level
Usage
el_log_trace(verboser, trace_level)
Log variance header and result
Description
Log variance header and result
Usage
el_log_variance_header(verboser, variance_method, bootstrap_reps)
Log weight diagnostics
Description
Log weight diagnostics
Usage
el_log_weight_diagnostics(verboser, W_hat, weights, trimmed_fraction)
Compute probability masses
Description
Compute probability masses
Usage
el_masses(weights, denom, floor, trim_cap)
Arguments
weights |
numeric respondent base weights (d_i) |
denom |
numeric denominators Di after floor guard |
floor |
numeric small positive guard |
trim_cap |
numeric cap (>0) or Inf |
Value
list with mass_untrim, mass_trimmed, prob_mass, trimmed_fraction
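The relationship between the returned components can be sketched as follows. This is a simplified assumption about how the pieces fit together (masses a_i / D_i, capped at trim_cap, then normalized); the floor argument is omitted and the package may handle trimming differently. masses_sketch is a hypothetical name.

```r
# Sketch: unnormalized masses, optional capping, and normalized probabilities.
# `masses_sketch` is a hypothetical helper, not package API.
masses_sketch <- function(weights, denom, trim_cap = Inf) {
  mass_untrim <- weights / denom               # a_i / D_i
  mass_trimmed <- pmin(mass_untrim, trim_cap)  # cap applied after solving
  list(
    mass_untrim = mass_untrim,
    mass_trimmed = mass_trimmed,
    prob_mass = mass_trimmed / sum(mass_trimmed),
    trimmed_fraction = mean(mass_untrim > trim_cap)
  )
}

m <- masses_sketch(weights = rep(1, 4), denom = c(0.5, 1, 2, 0.1), trim_cap = 5)
y <- c(1, 2, 3, 4)
y_hat <- sum(m$prob_mass * y)  # respondent weighted mean using the masses
```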
Compute the mean
Description
Compute the mean
Usage
el_mean(prob_mass, y)
Input preprocessing
Description
Parses the two-part Formula, constructs EL design matrices, injects the respondent delta indicator, attaches weights and survey metadata, and returns the pieces needed by the EL core.
Usage
el_prepare_inputs(
formula,
data,
weights = NULL,
n_total = NULL,
design_object = NULL
)
Details
Enforces the following format required by the rest of the EL code:
- LHS references exactly one outcome source variable in data; any transforms are applied via the formula environment and must be defined for all respondent rows.
- The outcome is never allowed to appear on RHS1 (auxiliaries) or RHS2 (missingness predictors), either explicitly in the formula or implicitly via dot (.) expansion. The missingness model uses the evaluated LHS expression as a dedicated predictor column instead.
- RHS1 always yields an intercept-free auxiliary design matrix with k-1 coding for factor auxiliaries, regardless of user +0/-1 syntax or custom contrasts. Auxiliary columns are validated to be fully observed and non-constant among respondents.
- RHS2 always yields a missingness-design matrix for respondents that includes an intercept column; zero-variance predictors emit warnings. NA among respondents is rejected.
- respondent_mask is defined from the raw outcome in data, not from the transformed LHS. An injected ..nmar_delta.. indicator in analysis_data must match this mask.
- N_pop is the analysis-scale population size: for IID it is nrow(data) unless overridden by n_total. For survey designs it is sum(weights) or n_total when supplied.
Prepare nleqslv args
Description
Prepare nleqslv args
Usage
el_prepare_nleqslv(control)
Auxiliary design computation
Description
Computes the respondent-side auxiliary matrix and the population means vector
used for centering X - \mu_x. When auxiliary_means is supplied, only
respondent rows are required to be fully observed. NA values are permitted on
nonrespondent rows. When auxiliary_means is NULL, auxiliaries must be fully
observed in the full data used to estimate population means.
Usage
el_resolve_auxiliaries(
aux_design_full,
respondent_mask,
auxiliary_means,
weights_full = NULL
)
Solver orchestration
Description
Solver orchestration
Usage
el_run_solver(
equation_system_func,
analytical_jac_func,
init,
final_control,
top_args,
solver_method,
use_solver_jac,
K_beta,
K_aux,
respondent_weights,
N_pop,
trace_level = 0
)
Arguments
equation_system_func |
Function mapping parameter vector to equation residuals. |
analytical_jac_func |
Analytic Jacobian function; may be NULL if unavailable or when forcing Broyden. |
init |
Numeric vector of initial parameter values. |
final_control |
List passed to |
top_args |
List of top-level |
solver_method |
Character; one of "auto", "newton", or "broyden". |
use_solver_jac |
Logical; whether to pass analytic Jacobian to Newton. |
K_beta |
Integer; number of response model parameters. |
K_aux |
Integer; number of auxiliary constraints. |
respondent_weights |
Numeric vector of base sampling weights. |
N_pop |
Numeric; population total. |
trace_level |
Integer; verbosity level. |
Validate design dimensions
Description
Validate design dimensions
Usage
el_validate_design_spec(design, data_nrow)
Validate matrix columns for NA and zero variance
Description
Validate matrix columns for NA and zero variance
Usage
el_validate_matrix(
mat,
allow_na,
label,
severity,
row_map = NULL,
scope_note = NULL,
plural_label = FALSE
)
Enforce nonnegativity of weights
Description
Softly enforces nonnegativity of a numeric weight vector. Large negative values are treated as errors, while small negative values are truncated to zero.
Usage
enforce_nonneg_weights(weights, tol = 1e-08)
Arguments
weights |
numeric vector of weights. |
tol |
numeric tolerance below which negative values are treated as numerical noise and clipped to zero. |
Details
Values below -tol are treated as clearly negative. Values in
[-tol, 0) are clipped to zero.
Value
A list with components:
- ok
logical; TRUE if no clearly negative weights were found.
- message
character; diagnostic message when ok is FALSE, otherwise NULL.
- weights
numeric vector of adjusted weights (original if ok is FALSE, otherwise with small negatives clipped to zero).
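The clipping rule described above can be illustrated with a minimal base-R sketch. The function name enforce_nonneg_sketch and the message wording are hypothetical, the input is assumed to contain no NA values, and the package's own implementation may differ in detail.

```r
# Hypothetical re-implementation of the documented behaviour, for illustration only.
enforce_nonneg_sketch <- function(weights, tol = 1e-8) {
  clearly_negative <- weights < -tol
  if (any(clearly_negative)) {
    # Clearly negative values: report failure and return the input unchanged
    return(list(
      ok = FALSE,
      message = sprintf("%d weight(s) below -tol", sum(clearly_negative)),
      weights = weights
    ))
  }
  # Remaining negatives lie in [-tol, 0): treat as numerical noise
  weights[weights < 0] <- 0
  list(ok = TRUE, message = NULL, weights = weights)
}

res_noise <- enforce_nonneg_sketch(c(1, -1e-10, 2))  # tiny negative is clipped
res_bad <- enforce_nonneg_sketch(c(1, -0.5, 2))      # clearly negative: ok is FALSE
```

Returning the original vector on failure lets the caller decide whether to stop or to fall back to another strategy.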
Extract engine configuration
Description
Extract engine configuration
Usage
engine_config(x)
Arguments
x |
An object inheriting from class 'nmar_engine'. |
Value
A named list of configuration fields.
Canonical engine name
Description
Returns the canonical identifier for an engine object.
Usage
engine_name(x)
Arguments
x |
An object inheriting from class 'nmar_engine'. |
Value
A single character string, e.g. "empirical_likelihood".
Exponential tilting estimator
Description
Generic for the exponential tilting (ET) estimator under NMAR. Methods are provided for 'data.frame' and 'survey.design'.
Usage
exptilt(data, ...)
Arguments
data |
A 'data.frame' or a 'survey.design'. |
... |
Passed to class-specific methods. |
Value
An engine-specific NMAR result object (for example
nmar_result_exptilt).
See Also
'exptilt.data.frame()', 'exptilt.survey.design()', 'exptilt_engine()'
Exponential tilting engine for NMAR
Description
Constructs a configuration for the exponential tilting estimator under
nonignorable nonresponse.
The estimator solves S_2(\boldsymbol{\phi}, \hat{\boldsymbol{\gamma}}) = 0 by applying an EM algorithm, with nleqslv as the solver.
Usage
exptilt_engine(
standardize = FALSE,
on_failure = c("return", "error"),
variance_method = c("bootstrap", "none"),
bootstrap_reps = 10,
supress_warnings = FALSE,
control = list(),
family = c("logit", "probit"),
y_dens = c("normal", "lognormal", "exponential", "binomial"),
stopping_threshold = 1,
sample_size = 2000
)
Arguments
standardize |
logical; standardize predictors. Default |
on_failure |
character; |
variance_method |
character; one of |
bootstrap_reps |
integer; number of bootstrap replicates when
|
supress_warnings |
Logical; suppress variance-related warnings. |
control |
Named list of control parameters passed to
See |
family |
character; response model family, either |
y_dens |
Outcome density model ( |
stopping_threshold |
Numeric; early stopping threshold. If the maximum absolute value of the score function falls below this threshold, the algorithm stops early (default: 1). |
sample_size |
Integer; maximum sample size for stratified random sampling (default: 2000). When the dataset exceeds this size, a stratified random sample is drawn to optimize memory usage. The sampling preserves the ratio of respondents to non-respondents in the original data. |
Details
The method is a robust Propensity-Score Adjustment (PSA) approach for nonresponse that is Not Missing at Random (NMAR).
It uses Maximum Likelihood Estimation (MLE), basing the likelihood on the observed part of the sample (f(\boldsymbol{Y}_i | \delta_i = 1, \boldsymbol{X}_i)), making it robust against outcome model misspecification.
The propensity score is estimated by assuming an instrumental variable (X_2) that is independent of the response status given other covariates and the study variable.
The estimator calculates fractional imputation weights w_i.
The final estimator is a weighted average, where the weights are the inverse of the estimated response probabilities \hat{\pi}_i, satisfying the estimating equation:
\sum_{i \in \mathcal{R}} \frac{\boldsymbol{g}(\boldsymbol{Y}_i, \boldsymbol{X}_i ; \boldsymbol{\theta})}{\hat{\pi}_i} = 0,
where \mathcal{R} is the set of observed respondents.
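As a worked special case of the estimating equation, take the population mean, for which g(Y; \theta) = Y - \theta. Solving the equation then gives a ratio of inverse-probability-weighted sums. The sketch below uses simulated data and plugs in the true response probabilities as a stand-in for the estimated \hat{\pi}_i; both are assumptions made only to keep the example self-contained, and this is not the package's fitting code.

```r
set.seed(1)
n <- 1000
y <- rnorm(n, mean = 1)
pi_true <- plogis(0.8 - 0.2 * y)      # response depends on y (NMAR)
delta <- rbinom(n, 1, pi_true)        # 1 = responded

y_resp <- y[delta == 1]
pi_hat <- pi_true[delta == 1]         # assumption: stand-in for estimated probabilities

# Solve sum_{i in R} (y_i - theta) / pi_hat_i = 0 for theta
theta_hat <- sum(y_resp / pi_hat) / sum(1 / pi_hat)

# The estimating equation evaluated at theta_hat is zero up to rounding
residual <- sum((y_resp - theta_hat) / pi_hat)
```

In practice the quality of the estimate depends on how well the response model recovers \hat{\pi}_i, which is the part the engine estimates.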
Value
An engine object of class c("nmar_engine_exptilt","nmar_engine").
This is a configuration list; it is not a fit. Pass it to nmar.
References
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245.
Examples
generate_test_data <- function(
    n_rows = 500,
    n_cols = 1,
    case = 1,
    x_var = 0.5,
    eps_var = 0.9,
    a = 0.8,
    b = -0.2
) {
  # Generate X variables - fixed to match comparison
  X <- as.data.frame(replicate(n_cols, rnorm(n_rows, 0, sqrt(x_var))))
  colnames(X) <- paste0("x", 1:n_cols)
  # Generate Y - fixed coefficients to match comparison
  eps <- rnorm(n_rows, 0, sqrt(eps_var))
  # Linear combination of the x variables, kept as a plain vector
  x_sum <- as.vector(as.matrix(X) %*% rep(1, n_cols))
  if (case == 1) {
    # Use fixed coefficient of 1 for all x variables to match: y = -1 + x1 + epsilon
    X$Y <- -1 + x_sum + eps
  } else if (case == 2) {
    X$Y <- -2 + 0.5 * exp(x_sum) + eps
  } else if (case == 3) {
    X$Y <- -1 + sin(2 * x_sum) + eps
  } else if (case == 4) {
    X$Y <- -1 + 0.4 * as.vector(as.matrix(X)^3 %*% rep(1, n_cols)) + eps
  }
  Y_original <- X$Y
  # Missingness mechanism - identical to comparison
  pi_obs <- 1 / (1 + exp(-(a + b * X$Y)))
  # Create missing values
  mask <- runif(nrow(X)) > pi_obs
  mask[1] <- FALSE # Ensure at least one observation is not missing
  X$Y[mask] <- NA
  return(list(X = X, Y_original = Y_original))
}
res_test_data <- generate_test_data(n_rows = 500, n_cols = 1, case = 1)
x <- res_test_data$X
exptilt_config <- exptilt_engine(
  y_dens = 'normal',
  control = list(maxit = 10),
  stopping_threshold = 0.1,
  standardize = FALSE,
  family = 'logit',
  bootstrap_reps = 5
)
formula <- Y ~ x1
res <- nmar(formula = formula, data = x, engine = exptilt_config, trace_level = 1)
summary(res)
Nonparametric Exponential Tilting (Internal Generic)
Description
Nonparametric Exponential Tilting (Internal Generic)
Usage
exptilt_nonparam(data, ...)
Arguments
data |
A data.frame or survey.design object |
... |
Other arguments passed to methods |
Value
An engine-specific NMAR result object for the nonparametric exponential tilting estimator.
Nonparametric exponential tilting engine for NMAR
Description
Constructs a configuration for the nonparametric exponential tilting estimator
under nonignorable nonresponse.
This engine implements the "Fully Nonparametric Approach" from Appendix 2
of Riddles et al. (2016). The estimator uses an
Expectation-Maximization (EM) algorithm to directly estimate the
nonresponse odds O(x_1, y) for aggregated, categorical data.
Usage
exptilt_nonparam_engine(refusal_col = "", max_iter = 100, tol_value = 1e-06)
Arguments
refusal_col |
character; the column name in |
max_iter |
integer; the maximum number of iterations for the EM algorithm. |
tol_value |
numeric; the convergence tolerance for the EM algorithm. The loop stops when the sum of absolute changes in the odds matrix is less than this value. |
Details
This engine is designed for cases where all variables (outcomes Y,
response predictors X_1, and instrumental variables X_2) are categorical,
and the input data is pre-aggregated into strata.
The method assumes an instrumental variable X_2 is available. The
response probability is assumed to depend on X_1 and Y, but not
on X_2.
The EM algorithm iteratively solves for the nonresponse odds:
O^{(t+1)}(x_1^*, y^*) = \frac{M_{y^*x_1^*}^{(t)}}{N_{y^*x_1^*}}
where M_{y^*x_1^*}^{(t)} is the expected count of non-respondents
(calculated in the E-step) and N_{y^*x_1^*} is the observed count
of respondents for a given stratum (x_1^*, y^*).
The final output from the nmar call is an object containing
data_to_return, an aggregated data frame where the original
'refusal' counts have been redistributed into the outcome columns
(e.g., 'Voted_A', 'Voted_B') as expected non-respondent counts.
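The odds update above can be sketched in base R on the exit poll counts, taking Gender as x_1 and Age_group as x_2. This is an illustrative reading of the E-step and M-step, not the package's implementation: it assumes the E-step splits each cell's refusals across outcomes in proportion to (respondent count) x (current odds), and all object names are hypothetical.

```r
# Respondent counts per outcome; rows are Gender x Age_group cells (4 male, 4 female)
resp <- cbind(
  Voted_A = c(93, 104, 146, 560, 106, 129, 170, 501),
  Voted_B = c(115, 233, 295, 350, 159, 242, 262, 218),
  Other   = c(4, 8, 5, 3, 8, 5, 5, 7)
)
refusal <- c(28, 82, 49, 174, 62, 70, 69, 211)
x1 <- rep(c("Male", "Female"), each = 4)   # response-model variable

lv <- unique(x1)
# Nonresponse odds O(x_1, y), started at 1
odds <- matrix(1, nrow = length(lv), ncol = ncol(resp),
               dimnames = list(lv, colnames(resp)))
# N: observed respondents per (x_1, y), summed over the instrument
N_obs <- rowsum(resp, x1)[lv, , drop = FALSE]

for (iter in seq_len(100)) {
  # E-step (assumed form): expected non-respondent counts per cell and outcome
  w <- resp * odds[x1, , drop = FALSE]
  M_cell <- refusal * w / rowSums(w)
  # M-step: O = M / N, aggregating over the instrument
  odds_new <- rowsum(M_cell, x1)[lv, , drop = FALSE] / N_obs
  change <- sum(abs(odds_new - odds))
  odds <- odds_new
  if (change < 1e-6) break
}

# Refusals redistributed into the outcome columns
adjusted <- resp + M_cell
```

Because the E-step only redistributes each cell's refusals, the row sums of the adjusted counts equal the original cell totals.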
Value
An engine object of class c("nmar_engine_exptilt_nonparam","nmar_engine").
This is a configuration list; it is not a fit. Pass it to nmar.
References
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245. See Appendix 2 for this specific method.
Examples
# Test data (Riddles 2016, Table 9)
voting_data_example <- data.frame(
  Gender = rep(c("Male", "Female"), each = 4),
  # Age labels must be identical across genders so that strata line up
  Age_group = rep(c("20-29", "30-39", "40-49", "50+"), times = 2),
  Voted_A = c(93, 104, 146, 560, 106, 129, 170, 501),
  Voted_B = c(115, 233, 295, 350, 159, 242, 262, 218),
  Other = c(4, 8, 5, 3, 8, 5, 5, 7),
  Refusal = c(28, 82, 49, 174, 62, 70, 69, 211),
  Total = c(240, 427, 495, 1087, 335, 446, 506, 937)
)
np_em_config <- exptilt_nonparam_engine(
  refusal_col = "Refusal",
  max_iter = 100,
  tol_value = 0.001
)
# Formula: Y1 + Y2 + ... ~ X1_vars | X2_vars
# Here, Y = Voted_A, Voted_B, Other
# x1 = Gender (response model)
# x2 = Age_group (instrumental variable)
em_formula <- Voted_A + Voted_B + Other ~ Gender | Age_group
results_em_np <- nmar(
  formula = em_formula,
  data = voting_data_example,
  engine = np_em_config,
  trace_level = 0
)
# View the final adjusted counts
# (Original counts + expected non-respondent counts)
print(results_em_np$data_final)
Extract top-level nleqslv arguments from a control-like list
Description
Extract top-level nleqslv arguments from a control-like list
Usage
extract_nleqslv_top(ctrl)
Default fitted values for NMAR results
Description
Returns fitted response probabilities.
Usage
## S3 method for class 'nmar_result'
fitted(object, ...)
Arguments
object |
An 'nmar_result' object. |
... |
Ignored. |
Value
A numeric vector (possibly length 0).
Formatter for engines
Description
Returns a single concise line summarizing an engine configuration.
Usage
## S3 method for class 'nmar_engine'
format(x, ...)
Arguments
x |
An engine object inheriting from 'nmar_engine'. |
... |
Unused. |
Value
A length-1 character vector.
Default formula for NMAR results
Description
Returns the estimation formula if available.
Usage
## S3 method for class 'nmar_result'
formula(x, ...)
Arguments
x |
An 'nmar_result' object. |
... |
Ignored. |
Value
A formula or 'NULL'.
Generate conditional density
Description
Generate conditional density
Usage
generate_conditional_density(model)
Arguments
model |
An internal exptilt object |
Glance summary for NMAR results
Description
One-row diagnostics for NMAR fits.
Usage
## S3 method for class 'nmar_result'
glance(x, ...)
Arguments
x |
An object of class 'nmar_result'. |
... |
Ignored. |
Value
A one-row data frame with diagnostics and metadata.
Construct logit response family
Description
Construct logit response family
Usage
logit_family()
Value
A list with components name, linkinv, mu.eta,
d2mu.deta2, and score_eta.
Prefer explicit solver_args over control-provided top-level args
Description
Prefer explicit solver_args over control-provided top-level args
Usage
merge_nleqslv_top(solver_args, control_top)
Construct EL Engine Object
Description
Construct EL Engine Object
Usage
new_nmar_engine_el(engine)
Constructor for result objects
Description
Builds an 'nmar_result' list using the shared schema and validates it.
Usage
new_nmar_result(...)
Details
Engine-level constructors should call this helper with named arguments rather
than assembling result lists by hand. At minimum, engines should supply
estimate (numeric scalar) and converged (logical). All other
fields are optional:
- estimate_name: label for the primary estimand (defaults to NA_character_ if omitted).
- se: standard error for the primary estimand (defaults to NA_real_ when not available).
- model, weights_info, sample, inference, diagnostics, meta, extra: lists that may be partially specified or NULL; validate_nmar_result() will back-fill missing subfields with safe defaults.
- class: engine-specific result subclass name, e.g. "nmar_result_el"; it is combined with the parent class "nmar_result".
Calling new_nmar_result() ensures that every engine returns objects
that satisfy the shared schema and are immediately compatible with parent
S3 methods such as vcov(), confint(), tidy(),
glance(), and weights().
Construct EL Result Object
Description
Construct EL Result Object
Usage
new_nmar_result_el(
y_hat,
se,
weights,
coefficients,
vcov,
converged,
diagnostics,
inputs,
nmar_scaling_recipe,
fitted_values,
call,
formula = NULL
)
Not Missing at Random Estimation
Description
Interface for NMAR estimation.
nmar() validates basic inputs and dispatches to an engine (for example
el_engine). The engine controls the estimation method and
interprets formula. See the engine documentation for model-specific
requirements.
Usage
nmar(formula, data, engine, trace_level = 0)
Arguments
formula |
A two-sided formula. Engines support a partitioned
right-hand side via |
data |
A |
engine |
An NMAR engine configuration object created by
|
trace_level |
Integer 0-3; controls verbosity during estimation
(default
|
Value
An object of class "nmar_result" with an engine-specific subclass
(for example "nmar_result_el"). Use summary(),
se(), confint(), weights(), coef(),
fitted(), and generics::tidy() / generics::glance() to
access estimates, standard errors, weights, and diagnostics.
See Also
el_engine, exptilt_engine,
exptilt_nonparam_engine, summary.nmar_result,
weights.nmar_result
Examples
set.seed(1)
n <- 200
x1 <- rnorm(n)
z1 <- rnorm(n)
y_true <- 0.5 + 0.3 * x1 + 0.2 * z1 + rnorm(n, sd = 0.3)
resp <- rbinom(n, 1, plogis(2 + 0.1 * y_true + 0.1 * z1))
if (all(resp == 1)) resp[sample.int(n, 1)] <- 0L
y_obs <- ifelse(resp == 1, y_true, NA_real_)
# Empirical likelihood engine
df_el <- data.frame(Y_miss = y_obs, X = x1, Z = z1)
eng_el <- el_engine(variance_method = "none")
fit_el <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_el)
summary(fit_el)
# Exponential tilting engine
dat_et <- data.frame(y = y_obs, x2 = z1, x1 = x1)
eng_et <- exptilt_engine(
  y_dens = "normal",
  family = "logit",
  variance_method = "none"
)
fit_et <- nmar(y ~ x2 | x1, data = dat_et, engine = eng_et)
summary(fit_et)
# Survey design
if (requireNamespace("survey", quietly = TRUE)) {
  w <- runif(n, 0.5, 2)
  des <- survey::svydesign(ids = ~1, weights = ~w,
                           data = data.frame(Y_miss = y_obs, X = x1, Z = z1))
  eng_svy <- el_engine(variance_method = "none")
  fit_svy <- nmar(Y_miss ~ X | Z, data = des, engine = eng_svy)
  summary(fit_svy)
}
# Bootstrap variance usage
# future.apply is optional, if installed, bootstrap may run in parallel under
# the user's future::plan()
set.seed(2)
eng_boot <- el_engine(
  variance_method = "bootstrap",
  bootstrap_reps = 20
)
fit_boot <- nmar(Y_miss ~ X | Z, data = df_el, engine = eng_boot)
se(fit_boot)
Format a number with fixed decimal places using nmar.digits
Description
Format a number with fixed decimal places using nmar.digits
Usage
nmar_fmt_num(x, digits = nmar_get_digits())
Format an abridged call line for printing
Description
Builds a concise one-line summary of the original call without materializing large objects (e.g., full data frames). Intended for use by print/summary methods.
Usage
nmar_format_call_line(x)
Details
Uses option 'nmar.show_call' (default TRUE). Width can be tuned via option 'nmar.call_width' (default 120).
Resolve global digits setting for printing
Description
Resolve global digits setting for printing
Usage
nmar_get_digits()
EL denominator floor
Description
Returns the small positive floor \delta used to guard the empirical
likelihood denominator D_i(\theta) away from zero.
Usage
nmar_get_el_denom_floor()
NMAR numeric settings
Description
NMAR numeric settings
Usage
nmar_get_numeric_settings()
Details
Centralized access to numeric thresholds used across the package.
- 'nmar.eta_cap': scalar > 0. Caps the response-model linear predictor to avoid extreme link values in Newton updates. Default 50.
- 'nmar.grad_eps': finite-difference step size epsilon for numeric gradients of smooth functionals. Default 1e-6.
- 'nmar.grad_d': relative step adjustment for numeric gradients. Default 1e-3.
Value
A named list with entries 'eta_cap', 'grad_eps', and 'grad_d'.
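To see why capping the linear predictor helps, the sketch below compares a logit inverse link with and without a cap of 50. It assumes the cap is applied symmetrically and that the setting is read as an R option via getOption(); both are assumptions for illustration, and the variable names are hypothetical.

```r
# Assumption: the threshold is stored as an R option; fall back to the documented default
cap <- getOption("nmar.eta_cap", 50)

eta <- c(-1000, -5, 0, 5, 1000)            # extreme linear predictors
eta_capped <- pmax(pmin(eta, cap), -cap)   # symmetric cap (assumed)

p_raw <- plogis(eta)            # underflows to exactly 0 at eta = -1000
p_capped <- plogis(eta_capped)  # stays strictly positive

# log(p) is finite after capping, which keeps score-type quantities well defined
log_p_capped <- log(p_capped)
```

Moderate values of the linear predictor pass through unchanged, so the cap only affects observations that would otherwise produce degenerate probabilities.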
Internal helpers for nmar_result objects
Description
Internal helpers for nmar_result objects
Usage
nmar_result_get_estimate(x)
Polish Household Budget Data with Simulated Nonignorable Nonresponse
Description
This dataset is derived from the 'h05' dataset (Polish household budgets for 2005) found in the 'RClas' package. The original data was cleaned to remove all rows with missing values.
Usage
polish_households
Format
A data frame with 19,330 rows and 17 columns. The key variables are:
- class
TODO
- voi
TODO
- bio
TODO
- type
TODO
- d345
TODO
- d347
TODO
- d348
TODO
- d36
TODO
- d38
TODO
- d61
TODO
- noper
TODO
- income
TODO
- expenditure
TODO
- y_exp
Numeric. The **true** scaled expenditure ('expenditure / mean(expenditure)'). This is the complete study variable without missingness.
- resp
TODO
- R
Integer. The simulated response indicator (1=responded, 0=nonresponse).
- y_exp_miss
Numeric. The **observed** scaled expenditure, containing 7,778 'NA' values where 'R = 0'. This is the variable to be used as the NMAR-affected outcome.
Details
To create a realistic test case for nonignorable nonresponse (NMAR), a nonresponse mechanism was simulated and applied to the scaled expenditure variable ('y_exp').
The key simulation steps were:
1. 'y_exp' (true study variable) was created by scaling total expenditure.
2. A true response probability ('resp') was created using the logistic model: 'plogis(1 - 0.6 * y_exp)'.
3. A response indicator ('R') was simulated based on this probability.
4. The final variable 'y_exp_miss' was generated by setting 'y_exp' to 'NA' wherever 'R' was 0.
The response is **nonignorable** because the probability of missingness depends directly on the value of the expenditure variable itself.
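The four simulation steps can be reproduced on any positive expenditure vector. The sketch below uses a simulated log-normal vector as a hypothetical stand-in for the real 'expenditure' column, so the resulting numbers will not match the shipped dataset.

```r
set.seed(123)
# Hypothetical stand-in for the household expenditure column
expenditure <- rlnorm(1000, meanlog = 7, sdlog = 0.5)

# 1. Scale expenditure by its mean
y_exp <- expenditure / mean(expenditure)
# 2. True response probability from the logistic model
resp <- plogis(1 - 0.6 * y_exp)
# 3. Simulated response indicator (1 = responded, 0 = nonresponse)
R <- rbinom(length(y_exp), 1, resp)
# 4. Observed outcome: NA wherever R is 0
y_exp_miss <- ifelse(R == 1, y_exp, NA_real_)
```

Because the coefficient on 'y_exp' is negative, households with higher expenditure are less likely to respond, so the respondent mean tends to understate the true mean.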
Source
TODO
See Also
'riddles_case1', 'riddles_case2', 'riddles_case3', 'riddles_case4'
Prepare scaled matrices and moments
Description
Prepare scaled matrices and moments
Usage
prepare_nmar_scaling(
Z_un,
X_un,
mu_x_un,
standardize,
weights = NULL,
weight_mask = NULL
)
Arguments
Z_un |
response model matrix (with intercept column). |
X_un |
auxiliary model matrix (no intercept), or NULL. |
mu_x_un |
named numeric vector of auxiliary means on the original scale
(names must match |
standardize |
logical; apply standardization if TRUE. |
weights |
Optional numeric vector used for weighted scaling. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to |
Value
A list with components Z, X, mu_x, and
recipe.
Print method for engines
Description
Compact summary for 'nmar_engine' objects.
Usage
## S3 method for class 'nmar_engine'
print(x, ...)
Arguments
x |
An engine object inheriting from 'nmar_engine'. |
... |
Unused. |
Value
'x', invisibly.
Print method for nmar_result
Description
Print method for nmar_result
Usage
## S3 method for class 'nmar_result'
print(x, ...)
Arguments
x |
nmar_result object |
... |
Additional parameters |
Value
'x', invisibly.
Print method for EL results
Description
Print for objects of class nmar_result_el.
Usage
## S3 method for class 'nmar_result_el'
print(x, ...)
Arguments
x |
An object of class |
... |
Ignored. |
Value
x, invisibly.
Print method for Exponential Tilting results (engine-specific)
Description
This print method is tailored for 'nmar_result_exptilt' objects and shows a concise, human-friendly summary of the estimation result together with exptilt-specific diagnostics (loss, iterations) and a compact view of the response coefficients stored in the fitted model.
Usage
## S3 method for class 'nmar_result_exptilt'
print(x, ...)
Arguments
x |
An object of class 'nmar_result_exptilt'. |
... |
Ignored. |
Value
'x', invisibly.
Print method for summary.nmar_result
Description
Print method for summary.nmar_result
Usage
## S3 method for class 'summary_nmar_result'
print(x, ...)
Arguments
x |
summary_nmar_result object |
... |
Additional parameters |
Value
'x', invisibly.
Construct probit response family
Description
Construct probit response family
Usage
probit_family()
Value
A list with components name, linkinv, mu.eta,
d2mu.deta2, and score_eta.
Riddles Simulation, Case 1: Linear Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 1) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case1
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 1):
y_true = -1 + x + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245.
Riddles Simulation, Case 2: Exponential Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 2) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case2
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 2):
y_true = -2 + 0.5 * exp(x) + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245.
Riddles Simulation, Case 3: Sine Wave Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 3) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case3
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 3):
y_true = -1 + sin(2 * x) + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245.
Riddles Simulation, Case 4: Cubic Mean
Description
A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 4) of Riddles, Kim, and Im (2016). The data features a nonignorable nonresponse (NMAR) mechanism where the response probability depends on the study variable 'y'.
Usage
riddles_case4
Format
A data frame with 500 rows and 4 variables:
- x
Numeric. The auxiliary variable, x ~ Normal(0, 0.5).
- y
Numeric. The study variable with nonignorable nonresponse. 'y' contains 'NA's for nonrespondents.
- y_true
Numeric. The complete, true value of 'y' before missingness was introduced.
- delta
Integer. The response indicator (1 = responded, 0 = nonresponse).
Details
This dataset was generated using the following model parameters (n = 500):
- Density for x:
x ~ Normal(mean = 0, variance = 0.5)
- Density for error:
e ~ Normal(mean = 0, variance = 0.9)
- True Model (Case 4):
y_true = -1 + 0.4 * x^3 + e
- Response Model (NMAR):
logit(pi) = 0.8 - 0.2 * y_true
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. Journal of Survey Statistics and Methodology, 4(2), 215–245.
Run method for EL engine
Description
Run method for EL engine
Usage
## S3 method for class 'nmar_engine_el'
run_engine(engine, formula, data, trace_level = 0)
Arguments
engine |
An object of class |
formula |
A two-sided formula passed through by |
data |
A |
trace_level |
Integer 0-3 controlling verbosity. |
Value
An object of class nmar_result_el.
Parse nleqslv control list for compatibility
Description
Parse nleqslv control list for compatibility
Usage
sanitize_nleqslv_control(ctrl)
Map unscaled auxiliary multipliers to scaled space
Description
Map unscaled auxiliary multipliers to scaled space
Usage
scale_aux_multipliers(lambda_unscaled, recipe, columns)
Arguments
lambda_unscaled |
named numeric vector of auxiliary multipliers aligned to auxiliary design columns on original scale. |
recipe |
Scaling recipe of class |
columns |
character vector of auxiliary column names (order) for the scaled design. |
Value
numeric vector of multipliers in the scaled space.
Map unscaled coefficients to scaled space
Description
Map unscaled coefficients to scaled space
Usage
scale_coefficients(beta_unscaled, recipe, columns)
Arguments
beta_unscaled |
named numeric vector of coefficients for the response
model on the original scale, including an intercept named |
recipe |
Scaling recipe of class |
columns |
character vector of column names (order) for the scaled design matrix (including intercept). |
Value
numeric vector of coefficients in the scaled space, ordered by columns.
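The mapping follows from writing each scaled column as (z - center) / scale: slopes are multiplied by the column scale, and the intercept absorbs the sum of slope times center. The sketch below checks this identity numerically. It does not use the package's recipe object; the intercept name "(Intercept)" and all variable names are assumptions for illustration.

```r
set.seed(42)
Z <- cbind(x1 = rnorm(20, 5, 2), x2 = rnorm(20, -1, 0.5))
center <- colMeans(Z)
scale_sd <- apply(Z, 2, sd)
Z_scaled <- sweep(sweep(Z, 2, center, "-"), 2, scale_sd, "/")

# Coefficients on the original scale (intercept name is an assumption)
beta_unscaled <- c("(Intercept)" = 0.3, x1 = -0.2, x2 = 1.5)
slopes <- beta_unscaled[colnames(Z)]

# Map to the scaled space: slope_j * sd_j, intercept + sum(slope_j * center_j)
beta_scaled <- c(
  "(Intercept)" = unname(beta_unscaled["(Intercept)"] + sum(slopes * center)),
  slopes * scale_sd
)

# Both parameterizations give the same linear predictor
eta_unscaled <- drop(beta_unscaled["(Intercept)"] + Z %*% slopes)
eta_scaled <- drop(beta_scaled["(Intercept)"] + Z_scaled %*% beta_scaled[colnames(Z)])
```

Inverting the same two relations recovers the original-scale coefficients from a fit on the scaled design.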
Extract standard error for NMAR results
Description
Returns the standard error of the primary mean estimate.
Usage
se(object, ...)
Arguments
object |
An 'nmar_result' or subclass. |
... |
Ignored. |
Value
Numeric scalar.
Weighted linear algebra
Description
Compute X' diag(w) X efficiently. If w >= 0, use SPD crossprod(X*sqrt(w)). Otherwise, fall back to X' (diag(w) X) via crossprod(X, X*w).
Usage
shared_weighted_gram(X, w)
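The two branches described above can be written in a few lines of base R. The name weighted_gram_sketch is hypothetical, and the sketch assumes w has one entry per row of X with no missing values.

```r
# Hypothetical sketch of the two documented code paths
weighted_gram_sketch <- function(X, w) {
  if (all(w >= 0)) {
    # Nonnegative weights: symmetric form (X * sqrt(w))' (X * sqrt(w))
    crossprod(X * sqrt(w))
  } else {
    # Mixed-sign weights: X' (w * X)
    crossprod(X, X * w)
  }
}

set.seed(7)
X <- matrix(rnorm(12), nrow = 4, ncol = 3)
w_pos <- c(0.5, 1, 2, 0)
w_mixed <- c(0.5, -1, 2, 0)

gram_pos <- weighted_gram_sketch(X, w_pos)
gram_mixed <- weighted_gram_sketch(X, w_mixed)

# Reference computation with an explicit diagonal matrix
ref_pos <- t(X) %*% diag(w_pos) %*% X
ref_mixed <- t(X) %*% diag(w_mixed) %*% X
```

Multiplying a matrix by a vector of length nrow(X) scales each row, which avoids forming the n-by-n diagonal matrix.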
Summary method for nmar_result
Description
Summary method for nmar_result
Usage
## S3 method for class 'nmar_result'
summary(object, conf.level = 0.95, ...)
Arguments
object |
nmar_result object |
conf.level |
Confidence level for intervals. |
... |
Additional parameters |
Value
An object of class 'summary_nmar_result'.
Summary method for EL results
Description
Summarize estimation, standard error and missingness-model coefficients.
Usage
## S3 method for class 'nmar_result_el'
summary(object, ...)
Arguments
object |
An object of class |
... |
Ignored. |
Value
An object of class summary_nmar_result_el.
Summary method for Exponential Tilting results (engine-specific)
Description
Summarize estimation, standard error and model coefficients.
Usage
## S3 method for class 'nmar_result_exptilt'
summary(object, conf.level = 0.95, ...)
Arguments
object |
An object of class 'nmar_result_exptilt'. |
conf.level |
Confidence level for confidence interval (default 0.95). |
... |
Ignored. |
Value
An object of class 'summary_nmar_result_exptilt'.
Tidy summary for NMAR results
Description
Return a data frame with the primary estimate and missingness-model coefficients.
Usage
## S3 method for class 'nmar_result'
tidy(x, conf.level = 0.95, ...)
Arguments
x |
An object of class 'nmar_result'. |
conf.level |
Confidence level for the primary estimate. |
... |
Ignored. |
Value
A data frame with one row for the primary estimate and, when available, additional rows for the response-model coefficients.
Trim weights by capping and proportional redistribution
Description
Applies a cap to a nonnegative weight vector and, when feasible, redistributes excess mass across the remaining positive entries so that the total sum is preserved. When the requested cap is too tight to preserve the total mass, all positive entries are set to the cap and the total sum decreases.
Usage
trim_weights(weights, cap, tol = 1e-12, warn_tol = 1e-08)
Arguments
weights |
numeric vector of weights. |
cap |
positive numeric scalar; maximum allowed weight, or |
tol |
numeric tolerance used when testing whether a rescaling step respects the cap. |
warn_tol |
numeric tolerance used when testing whether the total sum has been preserved. |
Details
Zero weights remain zero. Only entries that are positive after nonnegativity enforcement can absorb redistributed mass.
Internally, a simple water-filling style algorithm is used on the positive weights: the largest weights are successively saturated at the cap and the remaining weights are rescaled by a common factor chosen to maintain the total sum.
Value
A list with components:
- weights
numeric vector of trimmed weights.
- trimmed_fraction
fraction of entries at or very close to the cap (within tol).
- preserved_sum
logical; TRUE if the total sum of weights is preserved to within warn_tol.
- total_before
numeric; sum of the original weights.
- total_after
numeric; sum of the trimmed weights.
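A minimal sketch of the water-filling idea is shown below. The function trim_weights_sketch is hypothetical: it assumes nonnegative input without NA values and returns only the trimmed vector, not the full list of diagnostics described above.

```r
# Hypothetical sketch: cap weights and rescale the rest to keep the total
trim_weights_sketch <- function(weights, cap, tol = 1e-12) {
  pos <- weights > 0
  total <- sum(weights)
  out <- weights
  if (cap * sum(pos) < total) {
    # Cap too tight to preserve the total: saturate every positive entry
    out[pos] <- cap
    return(out)
  }
  saturated <- rep(FALSE, length(weights))
  repeat {
    free <- pos & !saturated
    # Common factor that restores the total over the unsaturated entries
    scale <- (total - cap * sum(saturated)) / sum(weights[free])
    over <- free & (weights * scale > cap + tol)
    if (!any(over)) break
    saturated <- saturated | over
  }
  out[free] <- weights[free] * scale
  out[saturated] <- cap
  out
}

w <- c(1, 2, 3, 10)
trimmed <- trim_weights_sketch(w, cap = 6)            # total of 16 is preserved
too_tight <- trim_weights_sketch(c(5, 5, 0), cap = 2) # total drops from 10 to 4
```

In the first call the largest weight is set to the cap and the other three are multiplied by a common factor, so their relative sizes are unchanged.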
Unscale coefficients and covariance
Description
Unscale coefficients and covariance
Usage
unscale_coefficients(scaled_coeffs, scaled_vcov, recipe)
Arguments
scaled_coeffs |
named numeric vector of coefficients estimated on the scaled space. |
scaled_vcov |
covariance matrix of |
recipe |
Scaling recipe of class |
Value
A list with components coefficients and vcov.
Validate and apply scaling for engines
Description
Validate and apply scaling for engines
Usage
validate_and_apply_nmar_scaling(
standardize,
has_aux,
response_model_matrix_unscaled,
aux_matrix_unscaled,
mu_x_unscaled,
weights = NULL,
weight_mask = NULL
)
Arguments
standardize |
logical; apply standardization if TRUE. |
has_aux |
logical; whether the engine uses auxiliary constraints. |
response_model_matrix_unscaled |
response model matrix (with intercept). |
aux_matrix_unscaled |
auxiliary matrix (no intercept) or an empty matrix. |
mu_x_unscaled |
named auxiliary means on original scale, or NULL. |
weights |
Optional numeric vector used for weighted scaling. |
weight_mask |
Optional logical mask or nonnegative numeric multipliers
applied to |
Value
A list with components nmar_scaling_recipe,
response_model_matrix_scaled, auxiliary_matrix_scaled, and mu_x_scaled.
Validate Data for NMAR Analysis
Description
Performs basic sanity checks on the input data.
Usage
validate_data(data)
Arguments
data |
A data frame or a survey object. |
Value
Returns 'invisible(NULL)' on success, stopping with a descriptive error on failure.
Validate top-level nleqslv arguments and coerce invalid to defaults
Description
Validate top-level nleqslv arguments and coerce invalid to defaults
Usage
validate_nleqslv_top(top)
Validate EL Engine Settings
Description
Validate EL Engine Settings
Usage
validate_nmar_engine_el(engine)
Validate nmar_result
Description
Ensures both the child class and the parent schema are satisfied. The validator also back-fills defaults so downstream code can rely on the presence of optional components without defensive checks.
Usage
validate_nmar_result(x, class_name)
Details
This helper is the single authority on the 'nmar_result' schema. It expects
a list that already carries class c(class_name, "nmar_result") and
at least a primary estimate stored in y_hat. All other components are
optional. When they are NULL or missing, the validator supplies safe
defaults:
- Core scalars: se (numeric, default NA_real_), estimate_name (character, default NA_character_), converged (logical, default NA).
- model: list with coefficients and vcov, both defaulting to NULL.
- weights_info: list with values (default NULL) and trimmed_fraction (default NA_real_).
- sample: list with n_total, n_respondents, is_survey, and design, defaulted to missing/empty values.
- inference: list with variance_method, df, and message, all defaulted to missing values.
- diagnostics, meta, and extra: defaulted to empty lists, with meta carrying engine_name, call, and formula when unset.
Engine constructors should normally call new_nmar_result() rather than
invoking this function directly. new_nmar_result() attaches classes and
funnels all objects through validate_nmar_result() so downstream S3
methods can assume a consistent structure.
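The back-filling behaviour described above can be pictured with a small base R sketch. The function backfill_defaults, the class name nmar_result_demo, and the numeric values are hypothetical and chosen only for illustration; the sketch covers a subset of the documented defaults and is not the package's validator.

```r
# Hypothetical sketch of default back-filling; not the package's validator.
backfill_defaults <- function(x, defaults) {
  for (nm in names(defaults)) {
    if (is.null(x[[nm]])) {
      # Assign via a one-element list so that NULL defaults keep their slot.
      x[nm] <- list(defaults[[nm]])
    } else if (is.list(defaults[[nm]]) && is.list(x[[nm]])) {
      # Recurse into nested components such as model or weights_info.
      x[[nm]] <- backfill_defaults(x[[nm]], defaults[[nm]])
    }
  }
  x
}

# A subset of the defaults listed in this section.
defaults <- list(
  se = NA_real_,
  estimate_name = NA_character_,
  converged = NA,
  model = list(coefficients = NULL, vcov = NULL),
  weights_info = list(values = NULL, trimmed_fraction = NA_real_),
  diagnostics = list()
)

# A minimal object carrying only y_hat and se (made-up values).
res <- structure(
  list(y_hat = 0.42, se = 0.05),
  class = c("nmar_result_demo", "nmar_result")
)
res <- backfill_defaults(res, defaults)
```

After this step, downstream code can read res$converged or res$model$vcov without first checking whether those components exist.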
Variance-covariance for NMAR results
Description
Variance-covariance for NMAR results
Usage
## S3 method for class 'nmar_result'
vcov(object, ...)
Arguments
object |
An object of class 'nmar_result'. |
... |
Ignored. |
Value
A 1x1 numeric matrix (the variance of the primary estimate).
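Because the returned matrix holds a single variance, a standard error and a normal-approximation interval follow directly from it. The sketch below uses a hand-built 1x1 matrix and a made-up point estimate rather than a fitted object; both values are hypothetical.

```r
# Hypothetical values standing in for vcov(fit) and the point estimate.
V <- matrix(0.0025, nrow = 1, ncol = 1)
estimate <- 0.42

se <- sqrt(diag(V))                               # standard error from the variance
ci <- estimate + c(-1, 1) * qnorm(0.975) * se     # 95% Wald-type interval
```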
Aggregated Exit Poll Data for Gangdong-Gap (2012)
Description
This dataset contains the aggregated exit poll results for the Gangdong-Gap district in Seoul from the 19th South Korean legislative election, held in 2012. The data are transcribed directly from Table 9 of Riddles, Kim, and Im (2016).
Usage
voting
Format
A data frame with 8 rows and 7 variables:
- Gender
Factor. The gender of the voter ("Male", "Female").
- Age_group
Character. The age group of the voter.
- Voted_A
Numeric. Count of respondents voting for Party A.
- Voted_B
Numeric. Count of respondents voting for Party B.
- Other
Numeric. Count of respondents voting for another party.
- Refusal
Numeric. Count of sampled individuals who refused to respond (this is the nonresponse count).
- Total
Numeric. Total individuals sampled in the group (Responders + Refusals).
Details
In the paper's application, 'Gender' is used as the nonresponse instrumental variable and 'Age_group' is the primary auxiliary variable.
Source
Riddles, M. K., Kim, J. K., & Im, J. (2016). A Propensity-Score-Adjustment Method for Nonignorable Nonresponse. *Journal of Survey Statistics and Methodology*, 4(1), 1–31. (Data from Table 9, p. 20).
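A quick consistency check on the documented column structure can be sketched as follows. The helper names check_group_totals and refusal_rate are hypothetical; the sketch assumes only the column names listed under Format and loads the data only if the package is installed.

```r
# Hypothetical helpers; they rely only on the documented column names.
check_group_totals <- function(df) {
  responders <- df$Voted_A + df$Voted_B + df$Other
  # Total is documented as Responders + Refusals.
  all(df$Total == responders + df$Refusal)
}

refusal_rate <- function(df) {
  df$Refusal / df$Total                 # per-group nonresponse rate
}

if (requireNamespace("NMAR", quietly = TRUE)) {
  data("voting", package = "NMAR")
  print(check_group_totals(voting))
  print(refusal_rate(voting))
}
```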
Extract weights from an 'nmar_result'
Description
Return analysis weights stored in an 'nmar_result' as either probability-scale (summing to 1) or population-scale (summing to 'sample$n_total'). The function normalizes stored masses and attaches informative attributes.
Usage
## S3 method for class 'nmar_result'
weights(object, scale = c("probability", "population"), ...)
Arguments
object |
An 'nmar_result' object. |
scale |
One of '"probability"' (default) or '"population"'. |
... |
Additional arguments (ignored). |
Value
Numeric vector of weights with length equal to the number of respondents.
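The two scales differ only by a constant factor, which the following base R sketch makes explicit. The function rescale_weights, its n_total argument, and the input masses are hypothetical; the sketch shows the normalization arithmetic only and does not attach the attributes mentioned in the description.

```r
# Hypothetical helper illustrating the two weight scales.
rescale_weights <- function(masses,
                            scale = c("probability", "population"),
                            n_total = NULL) {
  scale <- match.arg(scale)
  stopifnot(is.numeric(masses), all(masses >= 0), sum(masses) > 0)
  p <- masses / sum(masses)             # probability scale: sums to 1
  if (scale == "probability") {
    return(p)
  }
  stopifnot(is.numeric(n_total), length(n_total) == 1)
  n_total * p                           # population scale: sums to n_total
}

# Made-up respondent masses.
m <- c(2, 3, 5)
w_prob <- rescale_weights(m)
w_pop <- rescale_weights(m, scale = "population", n_total = 40)
```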