Created additional vignette for calib. weighting
maciejostapiuk committed Sep 4, 2024
1 parent dab9f86 commit 220142a
Showing 5 changed files with 189 additions and 4 deletions.
3 changes: 2 additions & 1 deletion docs/LICENSE-text.html


3 changes: 2 additions & 1 deletion docs/LICENSE.html


3 changes: 2 additions & 1 deletion docs/authors.html


3 changes: 2 additions & 1 deletion docs/pkgdown.yml
@@ -2,8 +2,9 @@ pandoc: 3.1.11
 pkgdown: 2.1.0
 pkgdown_sha: ~
 articles:
+  calibration_weighting: calibration_weighting.html
   theory: theory.html
-last_built: 2024-08-23T10:28Z
+last_built: 2024-09-04T21:24Z
 urls:
   reference: https://maciejostapiuk.github.io/MNAR/reference
   article: https://maciejostapiuk.github.io/MNAR/articles
181 changes: 181 additions & 0 deletions vignettes/calibration_weighting.Rmd
@@ -0,0 +1,181 @@
---
title: "Calibration weighting"
author: "Maciej Ostapiuk and Maciej Beręsewicz"
output:
  html_vignette:
    df_print: kable
    toc: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{Calibration weighting}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  chunk_output_type: console
bibliography: references.bib
link-citations: true
header-includes:
  - \usepackage{amsmath}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
\newcommand{\bX}{\boldsymbol{X}}
\newcommand{\bx}{\boldsymbol{x}}
\newcommand{\bY}{\boldsymbol{Y}}
\newcommand{\by}{\boldsymbol{y}}
\newcommand{\bh}{\boldsymbol{h}}
\newcommand{\bg}{\boldsymbol{g}}
\newcommand{\bH}{\boldsymbol{H}}
\newcommand{\ba}{\boldsymbol{a}}
\newcommand{\bp}{\boldsymbol{p}}
\newcommand{\bA}{\boldsymbol{A}}
\newcommand{\bw}{\boldsymbol{w}}
\newcommand{\bd}{\boldsymbol{d}}
\newcommand{\bZ}{\boldsymbol{Z}}
\newcommand{\bz}{\boldsymbol{z}}
\newcommand{\bv}{\boldsymbol{v}}
\newcommand{\bb}{\boldsymbol{b}}
\newcommand{\bu}{\boldsymbol{u}}
\newcommand{\bU}{\boldsymbol{U}}
\newcommand{\bQ}{\boldsymbol{Q}}
\newcommand{\bG}{\boldsymbol{G}}
\newcommand{\HT}{\text{\rm HT}}

\newcommand{\bbeta}{\boldsymbol{\beta}}
\newcommand{\balpha}{\boldsymbol{\alpha}}
\newcommand{\btau}{\boldsymbol{\tau}}
\newcommand{\bgamma}{\boldsymbol{\gamma}}
\newcommand{\bGamma}{\boldsymbol{\Gamma}}
\newcommand{\btheta}{\boldsymbol{\theta}}
\newcommand{\blambda}{\boldsymbol{\lambda}}
\newcommand{\bPhi}{\boldsymbol{\Phi}}
\newcommand{\bEta}{\boldsymbol{\eta}}

\newcommand{\bZero}{\boldsymbol{0}}
\newcommand{\indicator}{\mathbb{1}}

\newcommand{\colvec}{\operatorname{colvec}}
\newcommand{\logit}{\operatorname{logit}}
\newcommand{\Exp}{\operatorname{Exp}}
\newcommand{\Ber}{\operatorname{Bernoulli}}
\newcommand{\Uni}{\operatorname{Uniform}}
\newcommand{\argmin}{\operatorname{argmin}}



# Weighting methods
Usually, when non-response occurs, the summation in (\ref{eq:HT estimator}) underestimates the population total from (\ref{eq:Pop. total}). Thus, the initial weights $d_k$ under the given sampling design need to be corrected; in other words, we have to perform **calibration** of the $d_k$'s.

In general, we have the following settings:
- **Information on the target variable $Y$ is only available for respondents.**

- **Information on auxiliary variables $X$ is available under the following settings:**
- Unit-level data is available for respondents and non-respondents.
- Unit-level data is available only for respondents, but we have population totals for the reference population.

## Case when dimensions of calibration and response-model variables coincide

Let $\bx = (x_1, x_2, \ldots, x_p)^{\text{T}}$ denote the benchmark vector of chosen auxiliary variables and let $\bx_{k} = (x_{1_k}, x_{2_k}, \ldots, x_{p_k})^{\text{T}}$ be the vector of auxiliary variables for the $k$-th element of the sample $s$. Our settings state that $\bX$, the vector of population totals of the auxiliary variables, is known, i.e.
\begin{equation}\label{eq:Aux. Total}
\bX = \left(\sum_{k=1}^{N}{x_{1_k}}, \sum_{k=1}^{N}{x_{2_k}}, ..., \sum_{k=1}^{N}{x_{p_k}}\right)^{\text{T}} = \sum_{k \in U}{\bx_k}.
\end{equation}
If any of the auxiliary totals is not known, one might estimate it by substituting $x_{i_k}$ for $y_k$ in (\ref{eq:HT estimator}), i.e. $$\hat{X}^{i}_{\text{HT}} = \sum_{k \in s}{d_k x_{i_k}}, \; i = 1, \ldots, p.$$
However, using $\bx_k$ in place of $y_k$ does not always work in the process of estimating $\bX$. One needs weights slightly different from the $d_k$'s. These weights, denoted $w_k$, arise as solutions to an optimization problem of the form
\begin{equation}\label{eq: optimization w_k}
\argmin_{w_k}{\sum_{k\in r}G_k\left(w_k,d_k\right)},
\end{equation}
where $G_k$ is a strictly convex, differentiable distance function for which $G_k(d_k, d_k) = 0$ (equivalently, viewing $G_k$ as a function of the ratio $w_k/d_k$, $G_k(1) = G_k'(1) = 0$). In addition, the weights have to satisfy the following condition:
\begin{equation}\label{eq: calib eq}
\sum_{k\in r}{w_k\bx_k} = \sum_{k\in U}{\bx_k}.
\end{equation}
Equation (\ref{eq: calib eq}) is called the **calibration equation**. Using the method of Lagrange multipliers, @deville1992calibration show that the vector of calibration weights can be written as:
\begin{equation}\label{eq: w_k minimizers form}
w_k = d_k F_k (\blambda^{\text{T}}\bz_k)
\end{equation}
where $\bz_k$ is a vector of instrumental variables whose dimension coincides with that of $\bx_k$. Later in this vignette, we will consider the situation where $\bx_k$ has a higher dimension than $\bz_k$. $F_k$ is the inverse of $G_k'(w_k, d_k)$, defined as:
\begin{equation}\label{eq: partial derivative of G_k}
G_k'(w_k, d_k) = \frac{\partial{G_k(w_k, d_k)}}{\partial w_k}.
\end{equation}
There are various ways to choose the function $G_k$, but a common choice is:
\begin{equation}\label{eq: G_k example}
G_k(w_k,d_k) = \frac{\left(w_k - d_k\right)^2}{2d_k}.
\end{equation}
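For this choice, the derivative (\ref{eq: partial derivative of G_k}) equals $G_k'(w_k, d_k) = (w_k - d_k)/d_k$, so its inverse is the linear function $F_k(u) = 1 + u$, which explains the form of the weights that follows.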
Indeed, the solution $w_k$ of the problem stated in (\ref{eq: optimization w_k}) is then given by @estevaofunctional2000 as:
\begin{equation}\label{eq:G_k optimizers}
w_k = d_k(1 + \bz_k^{\text{T}}\blambda),
\end{equation}
where $\blambda$ is given by:
\begin{equation}\label{eq: g vector}
\blambda = \left(\sum_{k \in r}{d_k\bx_k\bz_k^{\text{T}}}\right)^{-1} \times \left(\bX - \sum_{k\in r}{d_k \bx_k}\right).
\end{equation}
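To make this concrete, below is a minimal R sketch (illustrative only; all object names are hypothetical and it does not use the package API) that simulates a population, treats a simple random sample as the respondent set $r$, computes $\blambda$, and verifies the calibration equation (\ref{eq: calib eq}):

```{r}
set.seed(123)
N <- 1000                          # population size
x1 <- rexp(N, rate = 1 / 2)        # auxiliary variables
x2 <- runif(N, 0, 10)
X_pop <- cbind(1, x1, x2)          # intercept included, so weights sum to N
X_total <- colSums(X_pop)          # known benchmark vector of totals

n <- 200
r_idx <- sample.int(N, n)          # respondents (SRS for illustration)
d <- rep(N / n, n)                 # initial design weights d_k
x_r <- X_pop[r_idx, ]
z_r <- x_r                         # instruments z_k = x_k (coinciding dimensions)

# lambda = (sum_r d_k x_k z_k^T)^{-1} (X - sum_r d_k x_k)
lambda <- solve(t(x_r * d) %*% z_r, X_total - colSums(x_r * d))
w <- d * (1 + drop(z_r %*% lambda))  # linear calibration weights

# calibration equation: sum_r w_k x_k = X
all.equal(colSums(x_r * w), X_total)
```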
Using the $w_k$ obtained in this way, known as the linear weights, a new, so-called "calibration-weighted" estimator of the target variable total from (\ref{eq:Pop. total}) takes the form:
\begin{equation}\label{eq: calibration-weighted estimator}
\hat{Y}_{\text{cal}} = \sum_{k \in r}{w_k y_k},
\end{equation}
which can be rewritten as:
\begin{equation}\label{eq: rendered calibration-weighted estimator}
\hat{Y}_{\text{cal}} = \sum_{k \in r }{d_k y_k} + \left(\bX - \sum_{k\in r}{d_k \bx_k}\right)^{\text{T}}\bb,
\end{equation}
where
\begin{equation*}
\bb = \left(\sum_{k \in r}{d_k\bz_k\bx_k^{\text{T}}}\right)^{-1} \times \sum_{k \in r}{d_k\bz_k y_k}.
\end{equation*}
Notice that $\hat{Y}_{\text{cal}}$ is no longer unbiased by design. However, it can still be consistent, as described in @isakifuller1982.
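Continuing the sketch above, one can check numerically that the two forms (\ref{eq: calibration-weighted estimator}) and (\ref{eq: rendered calibration-weighted estimator}) agree; the target variable below is simulated purely for illustration:

```{r}
y <- 2 + 3 * x1 - 0.5 * x2 + rnorm(N)  # hypothetical target variable
y_r <- y[r_idx]

Y_cal <- sum(w * y_r)                  # calibration-weighted estimator

# b = (sum_r d_k z_k x_k^T)^{-1} sum_r d_k z_k y_k
b <- solve(t(z_r * d) %*% x_r, colSums(z_r * (d * y_r)))
Y_cal2 <- sum(d * y_r) + drop(crossprod(X_total - colSums(x_r * d), b))

all.equal(Y_cal, Y_cal2)
```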

How does one formulate the prediction model in this case? Let us define two indicator random variables:
\begin{equation*}
I_k = \indicator\{k \in s\}, \;\;\; R_k = \indicator\{k \in r\}.
\end{equation*}
@kottweightingnonignorable2010 proposed the following set of equations as a double-protection justification:
\begin{equation}\label{eq: double-security-pred.framework}
\left\{\begin{array}{ll}
y_k &= \bx_k^{\text{T}} \bbeta_{\bx} + \epsilon_k,\\
\bz_k^{\text{T}} &= \bx_k^{\text{T}}\bGamma + \boldsymbol{\eta}_k^{\text{T}},
\end{array} \right.
\end{equation}
where $\bGamma$ is usually (though not necessarily) of full rank, $\bbeta_{\bx}$ is a coefficient vector, and
\begin{equation}
E{\left(\epsilon_k \mid \bx_k, I_k, R_k\right)} = 0, \;\; E{\left(\boldsymbol{\eta}_k \mid \bx_k, I_k, R_k\right)} = \bZero.
\end{equation}
Under the proposal in (\ref{eq: double-security-pred.framework}), the following property holds:
\begin{equation}\label{eq: property of 2-sec prediction}
\left(y_k - \bz_k^{\text{T}}\bbeta_{\bz}\right)|\bx_k = \left(\epsilon_k - \boldsymbol{\eta}_k^{\text{T}}\bGamma^{-1}\bbeta_{\bx}\right)|\bx_k,
\end{equation}
where
\begin{equation*}
\bbeta_{\bz} = \bGamma^{-1}\bbeta_{\bx}.
\end{equation*}
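To see why, substitute (\ref{eq: double-security-pred.framework}) into the left-hand side of (\ref{eq: property of 2-sec prediction}):
\begin{equation*}
y_k - \bz_k^{\text{T}}\bbeta_{\bz} = \bx_k^{\text{T}}\bbeta_{\bx} + \epsilon_k - \left(\bx_k^{\text{T}}\bGamma + \boldsymbol{\eta}_k^{\text{T}}\right)\bGamma^{-1}\bbeta_{\bx} = \epsilon_k - \boldsymbol{\eta}_k^{\text{T}}\bGamma^{-1}\bbeta_{\bx}.
\end{equation*}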

## When there are more calibration than response-model variables

First, let us consider $\bb_{\bz}^{*}$, the asymptotic limit of:
\begin{equation}\label{eq: b_z}
\bb_{\bz} = \left(\sum_{k \in S}{d_kR_kF'_k(\bz_k^{\text{T}}\blambda)\bx_k}\bz_k^{\text{T}}\right)^{-1} \times \sum_{k \in S}{d_kR_kF'_k(\bz_k^{\text{T}}\blambda)\bx_ky_k},
\end{equation}
which, alongside $\bb_{\bz}$, exists whether or not the prediction model holds. When the prediction model fails, $\bb_{\bz}$ no longer converges to $\bbeta_{\bz}$, although it still converges to $\bb_{\bz}^{*}$ as long as $\blambda$ converges to a finite $\blambda^{*}$. @chang_using_2008 considered this case and extended the weighting approach by replacing the calibration equation (\ref{eq: calib eq}) with the reformulated one:
\begin{equation}\label{eq: reformulated calib. eq.}
\boldsymbol{s} = \frac{1}{N} \left[\sum_{k \in S}{d_kR_kF_k(\bz_k^{\text{T}}\blambda)\bx_k} - \sum_{k \in S}{d_k\bx_k}\right] = \bZero
\end{equation}
by finding the $\blambda$ that minimizes $\boldsymbol{s}^{\text{T}}\bQ\boldsymbol{s}$ for some symmetric and positive definite $\bQ$. There are various ways to pick $\bQ$, as well as to deal with $\bGamma$ not being of full rank; a couple of examples can be found in @kottliaoalowingmorecalib2017. For example, one option is to use $\displaystyle \bQ^{-1} = \text{DIAG}\left[\left({N}^{-1} \sum_{S} d_k \bx_k\right) \left({N}^{-1}\sum_{S} d_k \bx_k^{\text{T}} \right)\right]$. After finding $\blambda$, the dimension of $\bx_k^{\text{T}}$ is reduced as follows:
\begin{align*}
\tilde{\bx}_k^{\text{T}} &= \bx_k^{\text{T}} N^{-1} \bQ \sum_{j \in S} d_j R_j F_k' \left( \bz_j^{\text{T}} \blambda \right) \bx_j \bz_j^{\text{T}}\\
&= \bx_k^{\text{T}} \left( \sum_{j \in S} d_j R_j F_k' \left( \bz_j^{\text{T}} \blambda \right) \bx_j \bx_j^{\text{T}} \right)^{-1} \sum_{j \in S} d_j R_j F_k' \left( \bz_j^{\text{T}} \blambda \right) \bx_j \bz_j^{\text{T}}\\
&= \bx_k^{\text{T}} \mathbf{B}_{\bz}.
\end{align*}
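As a rough numerical illustration of this approach (a sketch under assumed names, with a raking-type $F_k(u) = \exp(u)$ and $\bQ$ taken as the identity; this is not the `MNAR` implementation):

```{r}
set.seed(321)
N <- 1000; n <- 300
x1 <- rnorm(N); x2 <- runif(N)
X_pop <- cbind(1, x1, x2)   # calibration variables x_k (dimension 3)
Z_pop <- cbind(1, x1)       # response-model variables z_k (dimension 2)

S_idx <- sample.int(N, n)   # sample S
d <- rep(N / n, n)          # design weights
R <- rbinom(n, 1, plogis(0.5 + Z_pop[S_idx, 2]))  # response indicators R_k

# s(lambda) from the reformulated calibration equation
s_fun <- function(lambda) {
  Fv <- exp(drop(Z_pop[S_idx, ] %*% lambda))      # F_k(z_k^T lambda) = exp(.)
  (colSums(d * R * Fv * X_pop[S_idx, ]) - colSums(d * X_pop[S_idx, ])) / N
}
# minimize s^T Q s with Q = I; since dim(x) > dim(z), s is driven
# close to zero in the least-squares sense rather than exactly to zero
fit <- optim(c(0, 0), function(l) sum(s_fun(l)^2), method = "BFGS")
w <- d * R * exp(drop(Z_pop[S_idx, ] %*% fit$par))  # calibration weights
round(s_fun(fit$par), 4)
```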
Another approach to dimension reduction, proposed by @andridge2011proxy, works without searching for $\blambda$ and does not even rely on picking a $\bQ$ matrix; the idea relies on satisfying
\begin{equation}\label{eq: reformulated calib.eq}
\sum_{k \in S} w_k \tilde{\bx}_k = \sum_{k \in S} d_k R_k F_k(\bz_k^{\text{T}}\blambda) \tilde{\bx}_k = \sum_{k \in S} d_k \tilde{\bx}_k
\end{equation}
and setting
\begin{equation}\label{eq: setting A&L}
\tilde{\bx}_k^{\text{T}} = \bx_k^{\text{T}}\bA^{\text{T}},
\end{equation}
where $\bA^{\text{T}} =\left(\sum_{S}{R_j\bx_j\bx_j^{\text{T}}}\right)^{-1} \sum_{S}{R_j\bx_j\bz_j^{\text{T}}}$. Again, the reduction of dimensions is needed, and with the $\tilde{\bx}_k$ obtained this way one is able to perform the generalized calibration weighting technique. So far, this method is implemented as part of the `MNAR::gencal()` function.
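Continuing the example above, the @andridge2011proxy reduction (\ref{eq: setting A&L}) can be computed directly (again only a sketch; the packaged routine is `MNAR::gencal()`):

```{r}
xs <- X_pop[S_idx, ]; zs <- Z_pop[S_idx, ]
# A^T = (sum_S R_j x_j x_j^T)^{-1} sum_S R_j x_j z_j^T
A_t <- solve(t(R * xs) %*% xs, t(R * xs) %*% zs)
x_tilde <- xs %*% A_t   # reduced x_k, now of the same dimension as z_k
dim(x_tilde)
```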

# References
