---
title: "Computation Time Verification & Fitting Chains for Data Analysis"
output:
  pdf_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(LaplacesDemon)
library(BayesLogit) 
library(Matrix)
library(tidyverse)
```

# Overview 

This notebook was used to:

1. estimate the approximate runtime needed to fit the models to the North Carolina data;
2. fit the models to the South Atlantic Census Division data and record their runtimes. These chains were saved and used for the data analysis (Section 6 of the manuscript).

This was run on an M1 MacBook Pro (2021). 

Run on: `r Sys.Date()`. 


The samplers are loaded here:

```{r}
source("samplers/fh_fit.R")
source("samplers/dm_fit.R")
source("samplers/car_fit.R")
source("samplers/bym_fit.R")
source("samplers/ssd_fit.R")
```


# North Carolina 

## Load Data 

```{r warning=FALSE}
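# North Carolina data: loads `all_data` (and, presumably, the adjacency
# matrix `A` passed to the spatial samplers below)
load("data/data_NC/data.RDA")

# design matrix
covs <- c("degree", "assistance", "no_car", "povPerc",  
          "white", "black", "native", "asian", "hispanic")
X <- model.matrix(~ ., all_data[, covs, drop = FALSE]) 
n <- nrow(X); j <- ncol(X) # j counts the columns of X, including the intercept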

# response on the log scale; the sampling variance is rescaled by 1 / y^2
y <- all_data$rentBurden
d.var <- all_data$rentBurdenSE^2
y.star <- log(y)
d.star <- d.var / y^2
```
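
The variance rescaling in the chunk above is consistent with a first-order delta-method approximation. As a sketch, writing $D_i$ for the sampling variance of the direct estimate $y_i$ (notation assumed here, not taken from the manuscript):

$$
\operatorname{Var}(\log y_i) \;\approx\; \left( \left. \frac{d}{dy} \log y \,\right|_{y = y_i} \right)^{2} \operatorname{Var}(y_i) \;=\; \frac{D_i}{y_i^{2}}
$$

The right-hand side is the quantity computed by `d.star <- d.var / y^2`.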


## Run samplers

```{r n_carolina}
set.seed(7)
# ---- Run Chains ----
start <- proc.time()
fh_res <- fh_fit(X, y.star, d.star, ndesired=2500, nburn=10000, nthin=1, verbose=FALSE)
fh_time <- proc.time() - start

start <- proc.time()
dm_res <- dm_fit(X, y.star, d.star, ndesired=2500, nburn=10000, nthin=1, verbose=FALSE)
dm_time <- proc.time() - start

start <- proc.time()
car_res <- car_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
car_time <- proc.time() - start

start <- proc.time()
bym_res <- bym_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
bym_time <- proc.time() - start

start <- proc.time()
new_res <- ssd_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
ssd_time <- proc.time() - start
```


Computation time **in seconds**: 

```{r echo=FALSE}
print("--- FH Computation time ---")
fh_time

print("--- DM Computation time ---")
dm_time

print("--- CAR Computation time ---")
car_time

print("--- BYM Computation time ---")
bym_time

print("--- SSD Computation time ---")
ssd_time
```
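
The start/stop timing pattern used above can also be wrapped in a small helper. The following is a minimal sketch, not part of the original analysis: `time_fit()` and `toy_sampler()` are hypothetical names, and the toy sampler only stands in for the real `*_fit()` functions.

```r
# Hypothetical helper: evaluate a fitting call once and return both the
# result and the elapsed wall-clock time in seconds.
time_fit <- function(expr) {
  timing <- system.time(result <- expr)
  list(result = result, elapsed = timing[["elapsed"]])
}

# Toy stand-in for a sampler (assumption: the real samplers are far slower).
toy_sampler <- function(n) {
  Sys.sleep(0.05)
  rnorm(n)
}

set.seed(7)
fits <- list(
  a = time_fit(toy_sampler(10)),
  b = time_fit(toy_sampler(20))
)

# Collect the elapsed times into one table, in seconds and in minutes.
timings <- data.frame(
  model   = names(fits),
  seconds = vapply(fits, function(f) f$elapsed, numeric(1)),
  row.names = NULL
)
timings$minutes <- timings$seconds / 60
print(timings)
```

Only the `elapsed` entry is kept here because it is the wall-clock time; the `proc.time()` differences printed above also report user and system CPU time.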


# South Atlantic Census Division

## Load Data 

```{r warning=FALSE, message=FALSE}
library(tidycensus)
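# South Atlantic Census Division data: loads `all_data` (and, presumably,
# the adjacency matrix `A` passed to the spatial samplers below)
load("data/data_multi-state/data.RDA")
# attach state abbreviation and name via the first two digits of the FIPS code
all_data <- all_data %>% mutate(state_code = strtrim(fips, 2)) %>% 
  left_join(fips_codes %>% 
              select(state, state_code, state_name) %>% 
              distinct(), by = "state_code") 

# design matrix; `state` enters as a categorical covariate
covs <- c("degree", "povPerc", "black", "state")
X <- model.matrix(~ ., all_data[, covs, drop = FALSE]) 
n <- nrow(X); j <- ncol(X) # j counts the columns of X, including the intercept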

# response on the log scale; the sampling variance is rescaled by 1 / y^2
y <- all_data$rentBurden
d.var <- all_data$rentBurdenSE^2
y.star <- log(y)
d.star <- d.var / y^2
```
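
Because `state` is categorical, `model.matrix()` expands it into indicator columns, so `j` is larger than the number of names in `covs`. A small sketch with made-up values (the data frame `toy` is hypothetical and unrelated to `all_data`):

```r
# Toy data: one numeric covariate and one categorical covariate.
toy <- data.frame(
  degree = c(0.20, 0.30, 0.25, 0.40),
  state  = factor(c("NC", "SC", "NC", "GA"))
)

# Same formula as in the notebook: intercept plus every column of the data.
X_toy <- model.matrix(~ ., toy)

# The first factor level (GA) is the baseline and gets no column of its own.
colnames(X_toy)
# "(Intercept)" "degree" "stateNC" "stateSC"

ncol(X_toy)
# 4
```

With the default treatment contrasts, a factor with K levels contributes K - 1 columns, in addition to the intercept and the numeric covariates.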


## Run samplers

```{r s_atlantic}
set.seed(7)
# ---- Run Chains ----
start <- proc.time()
fh_res <- fh_fit(X, y.star, d.star, ndesired=2500, nburn=10000, nthin=1, verbose=FALSE)
fh_time <- proc.time() - start

start <- proc.time()
dm_res <- dm_fit(X, y.star, d.star, ndesired=2500, nburn=10000, nthin=1, verbose=FALSE)
dm_time <- proc.time() - start

start <- proc.time()
car_res <- car_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
car_time <- proc.time() - start

start <- proc.time()
bym_res <- bym_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
bym_time <- proc.time() - start

start <- proc.time()
new_res <- ssd_fit(X, y.star, d.star, A, ndesired=2500, nburn=2000, nthin=1, verbose=FALSE)
ssd_time <- proc.time() - start
```


Computation time **in seconds**: 

```{r echo=FALSE}
print("--- FH Computation time ---")
fh_time

print("--- DM Computation time ---")
dm_time

print("--- CAR Computation time ---")
car_time

print("--- BYM Computation time ---")
bym_time

print("--- SSD Computation time ---")
ssd_time
```

Computation time in **minutes (spatial models)**: 

```{r echo=FALSE}
print("--- CAR Computation time (minutes)---")
car_time/60

print("--- BYM Computation time (minutes)---")
bym_time/60

print("--- SSD Computation time (minutes)---")
ssd_time/60
```


## Save the chains for the South Atlantic Division 

These chains are used for the data analysis. 

```{r}
save(fh_res, file="data_analysis_chains/fh_res.RDA")
save(dm_res, file="data_analysis_chains/dm_res.RDA")
save(car_res, file="data_analysis_chains/car_res.RDA")
save(bym_res, file="data_analysis_chains/bym_res.RDA")
save(new_res, file="data_analysis_chains/new_res.RDA")
```
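
Files written with `save()` are restored with `load()`, which recreates each object under its original name. A minimal sketch of a round trip, using a temporary file and a made-up object (`toy_res` is hypothetical and does not reflect the structure of the real chains):

```r
# Write a toy object to a temporary .RDA file.
chain_file <- file.path(tempdir(), "toy_res.RDA")
set.seed(7)
toy_res <- list(beta = matrix(rnorm(20), nrow = 10))
save(toy_res, file = chain_file)

# Load into a separate environment so nothing in the workspace is overwritten.
chain_env <- new.env()
loaded_names <- load(chain_file, envir = chain_env)

# load() returns the names of the restored objects.
loaded_names
restored <- chain_env[[loaded_names]]
identical(restored, toy_res)
```

Loading into a dedicated environment is optional, but it avoids silently replacing objects such as `fh_res` that may already exist in the session.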