ECON 400: Econometrics - Bellevue College
The purpose of this statistical analysis is to examine the relationship between various economic variables and the 30-Year Fixed Rate Mortgage Average in the United States (MORTGAGE30US). I investigate the effect of these independent variables on the mortgage rate and determine which of them have a statistically significant relationship with MORTGAGE30US. I also address multicollinearity and skewness in the dataset, applying appropriate techniques to mitigate their effects. The findings contribute to a better understanding of the factors influencing mortgage rates in the United States.
The dataset used in this analysis is extracted from the Federal Reserve Economic Data (FRED) database, covering the period from October 1, 1993, to October 31, 2022. The dataset comprises 349 observations and includes 9 variables. The dependent variable is MORTGAGE30US and the independent variables are FEDFUNDS, CPILFESL, NGDPNSAXDCUSQ, TB3MS, GS20, USSTHPI, HQMCB10YR, and LREM64TTUSM156S.
MORTGAGE30US: 30-Year Fixed Rate Mortgage Average in the United States
FEDFUNDS: Federal Funds Effective Rate
CPILFESL: Consumer Price Index for All Urban Consumers: All Items Less Food and Energy in U.S. City Average
NGDPNSAXDCUSQ: Nominal Gross Domestic Product for United States
TB3MS: 3-Month Treasury Bill Secondary Market Rate, Discount Basis
GS20: Market Yield on U.S. Treasury Securities at 20-Year Constant Maturity, Quoted on an Investment Basis
USSTHPI: All-Transactions House Price Index for the United States
HQMCB10YR: 10-Year High Quality Market (HQM) Corporate Bond Spot Rate
LREM64TTUSM156S: Employment Rate: Aged 15-64: All Persons for the United States
When analyzing a dataset for statistical purposes, it is vital to examine the correlation matrix of the variables, as it shows the strength and direction of the pairwise linear relationships between them. The correlation coefficient ranges from -1 to 1, where values close to 1 indicate a strong positive correlation, values close to -1 indicate a strong negative correlation, and values close to 0 indicate little to no correlation. In the correlation matrix plot, many pairs of variables are highly correlated with each other, which indicates the presence of multicollinearity in the dataset.
[Figure: correlation matrix plot]
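As a minimal sketch of how a correlation matrix can be computed and screened for highly correlated pairs, the Python snippet below uses made-up toy values rather than the FRED series. The 0.9 cutoff and the helper names pearson and correlation_matrix are illustrative assumptions, not part of the original analysis.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {(name_i, name_j): r} for every ordered pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Made-up toy values, NOT the FRED data used in the report.
toy = {
    "FEDFUNDS": [0.1, 0.5, 1.0, 1.5, 2.0, 2.5],
    "TB3MS": [0.1, 0.4, 0.9, 1.4, 1.9, 2.4],
    "USSTHPI": [400.0, 390.0, 410.0, 380.0, 405.0, 395.0],
}
corr = correlation_matrix(toy)

# Flag each unordered pair whose absolute correlation exceeds the assumed 0.9 cutoff.
high = sorted((a, b) for (a, b), r in corr.items() if a < b and abs(r) > 0.9)
print(high)
```

Pairs flagged this way are candidates for the multicollinearity checks that follow; the cutoff itself is a judgment call.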
To address the presence of multicollinearity in the dataset, one approach is to check the Variance Inflation Factor (VIF) and drop variables with high VIF values. “As an arbitrary rule of thumb, it is often suggested that the VIF should not exceed 10.” (Dr. Lawrence, wk3)
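The VIF for predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below, in plain Python with made-up data, computes VIFs and drops the worst offender one at a time until every VIF is at or below the threshold of 10 quoted above. The column names x1, x2, x3 and all helper functions are hypothetical; the software and code actually used for the report are not shown here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(y, predictors):
    """R^2 from an OLS regression of y on the predictors plus an intercept."""
    n = len(y)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]  # design matrix
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(columns):
    """VIF_j = 1 / (1 - R_j^2) for every named column."""
    out = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        out[name] = 1.0 / (1.0 - r_squared(values, others))
    return out

def drop_high_vif(columns, threshold=10.0):
    """Drop the highest-VIF column one at a time until all VIFs are <= threshold."""
    cols = dict(columns)
    while len(cols) > 1:
        scores = vif(cols)
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del cols[worst]
    return cols

# Made-up toy data: x2 is almost a copy of x1, x3 is only loosely related.
toy = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.1, 1.9, 3.2, 3.8, 5.1, 5.9, 7.2, 7.8],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
kept = drop_high_vif(toy)
print(sorted(kept), vif(kept))
```

Recomputing the VIFs after each removal matters because dropping one variable changes the VIFs of all the others.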
By dropping variables with high VIF values one at a time, the final set of independent variables in the model is as follows:
FEDFUNDS with VIF of 4.558909
USSTHPI with VIF of 2.255771
HQMCB10YR with VIF of 3.373104
LREM64TTUSM156S with VIF of 3.548210
Three models are estimated for the analysis. The first model is an Ordinary Least Squares (OLS) model with the dependent variable (MORTGAGE30US) and independent variables including FEDFUNDS, CPILFESL, NGDPNSAXDCUSQ, TB3MS, GS20, USSTHPI, HQMCB10YR, and LREM64TTUSM156S. The second model is also an OLS model with the dependent variable and independent variables consisting of FEDFUNDS, USSTHPI, sqrt_HQMCB10YR, and sqrt_LREM64TTUSM156S. In this model, the independent variables sqrt_HQMCB10YR and sqrt_LREM64TTUSM156S are the square root transformations of HQMCB10YR and LREM64TTUSM156S. The last model uses Weighted Least Squares (WLS) regression with the same dependent variable and the independent variables FEDFUNDS, USSTHPI, sqrt_HQMCB10YR, and sqrt_LREM64TTUSM156S.
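For reference, weighted least squares chooses the coefficients that minimize the weighted sum of squared residuals, sum of w_i * (y_i - b0 - b1 * x_i)^2. The sketch below shows this for a single square-root-transformed predictor in plain Python. The data and the weights are made up for illustration, how the weights were chosen for the third model is not stated in this report, and the function name wls_simple is hypothetical.

```python
import math

def wls_simple(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x with observation weights w."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up raw predictor values; the square root is applied before fitting.
raw = [1.0, 4.0, 9.0, 16.0, 25.0]
x = [math.sqrt(v) for v in raw]
y = [5.1, 7.8, 11.2, 13.9, 17.3]

# Placeholder weights: observations believed to be noisier get smaller weights.
w = [1.0, 0.8, 0.6, 0.4, 0.2]

b0, b1 = wls_simple(x, y, w)
print(b0, b1)
```

With equal weights the same function reproduces the ordinary least squares fit, so the weights are the only thing that distinguishes the two estimators here.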
In the first model, the coefficients for FEDFUNDS, TB3MS, GS20, USSTHPI, HQMCB10YR, and
LREM64TTUSM156S are statistically significant (p < 0.05), indicating a significant
relationship between these variables and MORTGAGE30US. This suggests that
changes in these variables are associated with changes in the mortgage rate.
In the model with square-root-transformed variables, the coefficients for FEDFUNDS,
HQMCB10YR, sqrt_USSTHPI, and sqrt_LREM64TTUSM156S are statistically significant
(p < 0.05), indicating a significant relationship between these variables and
MORTGAGE30US. This suggests that changes in these variables are associated with
changes in the mortgage rate.
Based on the residual plots, the points appear to be randomly scattered around the
residual = 0 line. This indicates that the assumptions of linearity and constant variance
are reasonably met, suggesting that a linear model is appropriate for modeling this data.
The test for heteroscedasticity returns a p-value below 0.05, so we reject the null
hypothesis of constant error variance (homoscedasticity). We therefore have sufficient
evidence to conclude that heteroscedasticity is present in the regression model, which
means that the assumption of equal variance of the error terms is violated.
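The report does not say which heteroscedasticity test was used. The Breusch-Pagan test, in Koenker's studentized form, is one common choice and is assumed in the sketch below: the squared OLS residuals are regressed on the predictor, and LM = n * R^2 is compared with a chi-square distribution. To stay in plain Python the sketch handles one predictor only, so the reference distribution has one degree of freedom. The data and function names are made up for illustration.

```python
import math

def ols_simple(x, y):
    """OLS fit of y = b0 + b1 * x; returns (b0, b1, residuals, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sum(e * e for e in resid) / ss_tot
    return b0, b1, resid, r2

def breusch_pagan_simple(x, y):
    """Studentized Breusch-Pagan test for a one-predictor regression.

    Null hypothesis: the error variance does not depend on x.
    Returns (LM statistic, p-value from the chi-square(1) upper tail).
    """
    _, _, resid, _ = ols_simple(x, y)
    squared = [e * e for e in resid]
    _, _, _, aux_r2 = ols_simple(x, squared)  # auxiliary regression
    lm = max(0.0, len(x) * aux_r2)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(lm / 2.0))  # P(chi-square(1) > lm)
    return lm, p_value

# Made-up data whose error spread grows with x (deterministic, for repeatability).
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + (-1) ** i * 0.1 * xi for i, xi in enumerate(x, start=1)]

lm, p = breusch_pagan_simple(x, y)
print(lm, p)
```

A p-value below 0.05 leads to the same decision described above: reject constant variance and treat the classical standard errors with caution.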
With robust standard errors, which remain valid in the presence of heteroscedasticity,
all the coefficients in the model remain statistically significant. The estimated
relationships between these variables and the dependent variable therefore hold up
even when accounting for heteroscedasticity.
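To illustrate what a robust standard error changes, the sketch below compares the classical OLS standard error of a slope with White's heteroscedasticity-consistent (HC0) version for a one-predictor regression. Which robust variant the report used is not stated, so HC0 is an assumption, and the data and function name are made up.

```python
import math

def slope_standard_errors(x, y):
    """Return (b1, classical_se, hc0_se) for the slope of y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Classical OLS: one common error variance s^2 = SSR / (n - 2).
    s2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(s2 / sxx)
    # White HC0: each observation contributes its own squared residual.
    hc0 = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return b1, classical, hc0

# Made-up data whose error spread grows with x.
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + (-1) ** i * 0.1 * xi for i, xi in enumerate(x, start=1)]

b1, se_classical, se_robust = slope_standard_errors(x, y)
print(b1, se_classical, se_robust)
```

The coefficient estimate is identical in both cases; only the standard error, and therefore the t-statistic and p-value, changes.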
All variables in the model have p-values less than 0.05, indicating their statistical
significance in explaining the variation in the mortgage rate. The F-statistic of 2235
with a p-value of < 2.2e-16 indicates that the model as a whole is highly statistically
significant.
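The overall F-statistic and R-squared are linked by F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) for a model with k predictors, an intercept, and n observations. The sketch below applies that identity in both directions. Using n = 349 and k = 4 with the reported F of 2235 is an assumption about which model the statistic belongs to, since the regression output itself is not reproduced here.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F-statistic for H0: all slope coefficients are zero."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)

def r_squared_from_f(f_value, n_obs, n_predictors):
    """Invert the identity above to recover R^2 from a reported F-statistic."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return f_value * df_model / (f_value * df_model + df_resid)

# Assumed setup: 349 observations, 4 predictors plus an intercept, F = 2235.
implied_r2 = r_squared_from_f(2235.0, 349, 4)
print(implied_r2)
```

Under these assumptions the implied R-squared comes out a little above 0.96, which can serve as a quick consistency check between a reported F-statistic and a reported fit.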
In conclusion, this statistical analysis examined the relationship between various economic variables and the 30-Year Fixed Rate Mortgage Average in the United States. Through the evaluation of three different models, I identified key variables that have a significant impact on the mortgage rate.
The results of the best model revealed that FEDFUNDS, HQMCB10YR, sqrt_USSTHPI, and sqrt_LREM64TTUSM156S were statistically significant predictors of MORTGAGE30US. These findings suggest that changes in these variables are associated with changes in the mortgage rate. Moreover, the models demonstrated a high overall statistical significance, indicating that the independent variables are jointly significant in predicting the mortgage rate. The high R-squared values of the models indicate a good fit to the data, with approximately 96% of the variance in MORTGAGE30US explained by the independent variables. These findings provide valuable insights into the factors influencing the mortgage rate and can assist in making informed decisions in the real estate and financial sectors. However, it is important to acknowledge the limitations of the models, such as the presence of heteroscedasticity and the need for further robustness checks. Further research and analysis are recommended to validate and enhance the findings presented in this study.
"Robust Standard Errors." Economic Theory Blog, 7 Aug. 2016, https://economictheoryblog.com/2016/08/07/robust-standard-errors/#:~:text=%E2%80%9CRobust%E2%80%9D%20standard%20errors%20is%20a,linear%20unbiased%20estimator%20(BLUE).
"How to Interpret a Residual Plot - Explanation." Study.com, n.d. Accessed 26 May 2023. https://study.com/skill/learn/how-to-interpret-a-residual-plot-explanation.html.
"FRED (Federal Reserve Economic Data)." Federal Reserve Bank of St. Louis, fred.stlouisfed.org/
"Weighted Least Squares in R - Statology." Statology, 2023, www.statology.org/weighted-least-squares-in-r/
"Skewness and Kurtosis in R - Statology." Statology, 2023, www.statology.org/skewness-kurtosis-in-r/
Wk3-Collinearity
WK6-Hsk