A collection of functions for inspecting and manipulating data frames, and for keeping track of data manipulation by producing log files.
Available on CRAN: https://cran.r-project.org/package=cleandata
Demonstration: Wrangling Ames Housing Dataset
New in V0.3.0
- The 'log' parameter, which controls whether log files are produced, can now take its value from a 'log_arg' variable in the parent environment of a function (dynamic scoping)
- Assigning a value to 'log' directly, as in earlier versions, is still supported
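The lookup described above can be sketched in base R. This is an illustration of the idea only, not the package's actual implementation: `resolve_log` and `wrangle` are hypothetical names, and treating the log setting as a file name is an assumption made for the example.

```r
# Hypothetical helper (not part of cleandata): resolve the log setting.
# An explicit `log` argument wins; otherwise look for `log_arg` in the
# caller's environment and its enclosing environments.
resolve_log <- function(log = NULL) {
  caller <- parent.frame()
  if (is.null(log)) {
    log <- get0("log_arg", envir = caller, inherits = TRUE, ifnotfound = FALSE)
  }
  log
}

# Hypothetical wrapper: sets `log_arg` once, then calls the helper
# without passing `log` explicitly.
wrangle <- function() {
  log_arg <- "wrangle.log"
  resolve_log()
}

from_parent <- wrangle()               # picked up from wrangle()'s environment
explicit    <- resolve_log("other.log") # explicit value is used as given
```

Setting the variable once in a wrapper function avoids repeating the same argument in every call inside it.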
List of Functions
Inspection
- inspect_map: Classify The Columns of A Data Frame
- inspect_na: Find Out Which Columns Have Most NAs
- inspect_smap: A Simplified Thus Faster Version of inspect_map
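As a rough idea of what this kind of inspection involves, here is a base R sketch on a made-up data frame. It does not call the package's functions, whose arguments and return values are not described here; it only shows the underlying operations of classifying columns and ranking them by NA count.

```r
# Toy data frame, invented for illustration.
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c("x", "y", NA, "z"),
  c = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# First class of each column, as a named character vector.
col_classes <- vapply(df, function(col) class(col)[1], character(1))

# Number of NAs per column, columns with the most NAs first.
na_counts <- sort(colSums(is.na(df)), decreasing = TRUE)
```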
Imputation
- impute_mean: Impute Missing Values by Mean
- impute_median: Impute Missing Values by Median
- impute_mode: Impute Missing Values by Mode
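The three imputation strategies can be illustrated with plain base R. The helpers `impute_with` and `stat_mode` below are hypothetical and written for this example; they are not the package's functions and say nothing about how those handle logging or column selection.

```r
# Hypothetical helper: replace NAs in a vector with a statistic
# computed from its non-missing values.
impute_with <- function(x, stat) {
  x[is.na(x)] <- stat(x[!is.na(x)])
  x
}

# Hypothetical helper: most frequent value (first one wins on ties).
stat_mode <- function(v) {
  ux <- unique(v)
  ux[which.max(tabulate(match(v, ux)))]
}

# Toy data, invented for illustration.
num <- c(1, 2, NA, 9)
cat <- c("a", "b", "a", NA)

by_mean   <- impute_with(num, mean)      # NA replaced by the mean of 1, 2, 9
by_median <- impute_with(num, median)    # NA replaced by the median of 1, 2, 9
by_mode   <- impute_with(cat, stat_mode) # NA replaced by the most frequent value
```

The mean and median suit numeric columns, while the mode also works for categorical ones.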
Encoding
- encode_binary: Encode Binary Data Into 0 and 1
- encode_ordinal: Encode Ordinal Data Into Integers
- encode_onehot: One-Hot Encoding
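For readers unfamiliar with the three encodings, the following base R sketch shows what each produces on a made-up data frame. It uses only base R and `stats::model.matrix`, not the package's encode_* functions; the column names and level order are assumptions chosen for the example.

```r
# Toy data frame, invented for illustration.
df <- data.frame(
  smoker = c("yes", "no", "yes"),
  size   = c("small", "large", "medium"),
  color  = c("red", "green", "red"),
  stringsAsFactors = FALSE
)

# Binary: map a two-valued column to 0 and 1.
binary <- as.integer(df$smoker == "yes")

# Ordinal: map ordered categories to integers, using an explicit order.
size_order <- c("small", "medium", "large")
ordinal <- as.integer(factor(df$size, levels = size_order))

# One-hot: one 0/1 indicator column per category (no intercept).
onehot <- model.matrix(~ color - 1, data = df)
```

Ordinal encoding needs the order to be stated explicitly, since alphabetical order would place "large" before "small".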
Partitioning
- partition_random: Partition A Dataset Randomly
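A random partition can be sketched in base R as follows. This is not the package's function and makes no claim about its arguments or output format; the 70/30 split, the seed, and the label names are assumptions for the example.

```r
# Toy data frame, invented for illustration.
df <- data.frame(id = 1:10)

set.seed(42)                    # fixed seed so the split is reproducible
n <- nrow(df)
n_train <- 7                    # assumed 70/30 split

# Draw the training rows at random, label the rest as test rows.
train_idx <- sample(n, size = n_train)
df$partition <- ifelse(seq_len(n) %in% train_idx, "train", "test")

train <- df[df$partition == "train", , drop = FALSE]
test  <- df[df$partition == "test", , drop = FALSE]
```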
Other
- wh_dict: Create Data Dictionary from Data Warehouse