
4. How to deal with NA values in time series data?

Hyesop edited this page Jul 31, 2018 · 2 revisions

Missing data

  • Why do some days have NA values?
  • How do I change all these values to a numeric value?

The missing values listed below come from the Gangnam monitoring station records for 2010 to 2017. The long runs of NA in 2012 suggest that something went wrong at the station that year. The cause is unknown, but it could be, for example, a lost signal or equipment damage.

1260	2011.9.22	pm10	work	53	NA	53	53	53	53	53	17.909

1493	2012.1.17	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1494	2012.1.17	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1495	2012.1.18	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1496	2012.1.18	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1497	2012.1.19	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1498	2012.1.19	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1499	2012.1.20	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1500	2012.1.20	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1501	2012.1.21	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1502	2012.1.21	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1503	2012.1.22	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1504	2012.1.22	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1505	2012.1.23	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1506	2012.1.23	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1507	2012.1.24	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1508	2012.1.24	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1509	2012.1.25	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1510	2012.1.25	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1511	2012.1.26	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
1512	2012.1.26	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!

2025	2012.10.9	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2026	2012.10.9	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2027	2012.10.10	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2028	2012.10.10	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2029	2012.10.11	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2030	2012.10.11	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2031	2012.10.12	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2032	2012.10.12	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2033	2012.10.13	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2034	2012.10.13	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2035	2012.10.14	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2036	2012.10.14	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2037	2012.10.15	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2038	2012.10.15	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
2040	2012.10.16	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!

2580	2013.7.13	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!

3796	2015.3.13	pm10	work	54	NA	54	54	54	54	54	-0.6

4799	2016.7.27	pm10	home	50	NA	50	50	50	50	50	15.909
4800	2016.7.27	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
4807	2016.7.31	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
4808	2016.7.31	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
4810	2016.8.1	pm10	work	9	NA	9	9	9	9	9	-21.8

5275	2017.3.22	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5276	2017.3.22	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5277	2017.3.23	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5278	2017.3.23	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5279	2017.3.24	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5280	2017.3.24	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5281	2017.3.25	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5282	2017.3.25	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5283	2017.3.26	pm10	home	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
5284	2017.3.26	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!

5332	2017.4.19	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!

5498	2017.7.11	pm10	work	NA	NA	Inf	NA	NA	NA	#NAME?	#VALUE!
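As a starting point for the second question above, the listing can be read into R so that every non-numeric placeholder becomes a proper NA. The following is a minimal sketch using base R only. The column names (`row`, `date`, `pollutant`, `place`, `v1` to `v8`) are hypothetical, because the listing has no header, and the three sample rows are copied from the listing. It assumes that `#NAME?`, `#VALUE!` and `Inf` all stand for "no valid measurement".

```r
# Three rows copied from the listing above, tab-separated as in the source.
raw <- paste(
  "1260\t2011.9.22\tpm10\twork\t53\tNA\t53\t53\t53\t53\t53\t17.909",
  "1493\t2012.1.17\tpm10\thome\tNA\tNA\tInf\tNA\tNA\tNA\t#NAME?\t#VALUE!",
  "4810\t2016.8.1\tpm10\twork\t9\tNA\t9\t9\t9\t9\t9\t-21.8",
  sep = "\n"
)

# Hypothetical column names: the source listing has no header row.
value_cols <- paste0("v", 1:8)

dat <- read.table(
  text = raw, sep = "\t", header = FALSE, quote = "",
  col.names = c("row", "date", "pollutant", "place", value_cols),
  na.strings = c("NA", "#NAME?", "#VALUE!"),  # spreadsheet errors become NA
  comment.char = "",                          # keep '#' from starting a comment
  stringsAsFactors = FALSE
)

# Force the value columns to numeric and turn Inf/-Inf into NA as well.
dat[value_cols] <- lapply(dat[value_cols], function(x) {
  x <- as.numeric(x)
  x[!is.finite(x)] <- NA
  x
})

# Dates are written as year.month.day without leading zeros.
dat$date <- as.Date(dat$date, format = "%Y.%m.%d")

print(dat)
```

Setting `comment.char = ""` matters here: with the `read.table` default, everything after a `#` would be discarded as a comment and the affected rows would come out with too few fields.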

Using R imputeTS package

Steffen Moritz has recently built the imputeTS package, which specialises in (univariate) time series imputation.

Functions

The package offers several imputation functions, listed below:

Function          Description
na.interpolation  Missing Value Imputation by Interpolation
na.kalman         Missing Value Imputation by Kalman Smoothing
na.locf           Missing Value Imputation by Last Observation Carried Forward
na.ma             Missing Value Imputation by Weighted Moving Average
na.mean           Missing Value Imputation by Mean Value
na.random         Missing Value Imputation by Random Sample
na.remove         Remove Missing Values
na.replace        Replace Missing Values by a Defined Value
na.seadec         Seasonally Decomposed Missing Value Imputation
na.seasplit       Seasonally Splitted Missing Value Imputation

This is only an overview. Each function is described in detail in the package manual (https://cran.r-project.org/web/packages/imputeTS/imputeTS.pdf).
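To make the difference between some of these strategies concrete, here is a minimal base R sketch of three of the ideas: mean value, last observation carried forward, and linear interpolation. It does not call imputeTS and is not the package's implementation. The helper names `fill_mean`, `fill_locf` and `fill_linear` and the toy series are made up for illustration.

```r
# Toy PM10 series with gaps (values are made up for illustration).
pm10 <- c(53, NA, NA, 50, 9, NA, 54)

# Mean value: every gap gets the mean of the observed values.
fill_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Last observation carried forward: each gap copies the latest earlier value.
# Leading gaps have no earlier value and stay NA.
fill_locf <- function(x) {
  obs <- x[!is.na(x)]
  idx <- cumsum(!is.na(x))   # how many observations have been seen so far
  idx[idx == 0] <- NA        # nothing observed yet
  obs[idx]
}

# Linear interpolation between the neighbouring observed values.
# Gaps before the first or after the last observation stay NA.
fill_linear <- function(x) {
  ok <- !is.na(x)
  approx(x = which(ok), y = x[ok], xout = seq_along(x))$y
}

print(fill_mean(pm10))
print(fill_locf(pm10))
print(fill_linear(pm10))
```

On this toy series the three methods disagree noticeably in the gap between 9 and 54: the mean fills in 41.5, carrying forward fills in 9, and interpolation fills in 31.5. Which behaviour is appropriate depends on how long the gaps are and how quickly the measured quantity changes.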

Missing Data Plots

The package also provides the following plots for inspecting missing data:

Function                Description
plotNA.distribution     Visualize Distribution of Missing Values
plotNA.distributionBar  Visualize Distribution of Missing Values (Barplot)
plotNA.gapsize          Visualize Distribution of NA gapsizes
plotNA.imputations      Visualize Imputed Values
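The kind of numbers that sit behind a missing-value or gap-size plot can also be computed directly. The sketch below uses base R only and a made-up toy series. It illustrates what "gap size" means (the length of a run of consecutive NA values) and is not the code used by the plotting functions above.

```r
# Toy PM10 series (values are made up for illustration).
pm10 <- c(53, NA, NA, 50, 9, NA, 54, NA, NA, 48)

# Overall amount of missing data.
n_missing <- sum(is.na(pm10))
share_missing <- mean(is.na(pm10))

# Run-length encode the NA indicator; runs where the value is TRUE are gaps.
runs <- rle(is.na(pm10))
gap_sizes <- runs$lengths[runs$values]

# How often each gap size occurs.
gap_table <- table(gap_sizes)

print(n_missing)
print(share_missing)
print(gap_table)
```

Here the series has five missing values out of ten, arranged as one gap of length 1 and two gaps of length 2.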

Reference

Moritz, Steffen, and Thomas Bartz-Beielstein. "imputeTS: Time Series Missing Value Imputation in R." R Journal 9.1 (2017).