Thanksgiving

sherrisherry released this 02 Dec 03:52

A collection of functions that work with data frames to inspect and manipulate data, and to keep track of data manipulation by producing log files.

Available on CRAN: https://cran.r-project.org/package=cleandata

Demonstration: Wrangling Ames Housing Dataset

New in V0.3.0

  • The 'log' parameter can now take its value from a 'log_arg' variable in the parent environment of a function (dynamic scoping)
    • Assigning a value to 'log' directly, as before, is still supported
    • 'log' is the parameter that controls the production of log files
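
The lookup described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `count_na` and `wrangle` are hypothetical functions, and treating 'log' as a file path is an assumption made only for this example (the package may expect a different kind of value).

```r
# Hypothetical function, NOT part of cleandata. If 'log' is not supplied,
# it falls back to a variable named 'log_arg' visible from the caller.
count_na <- function(x, log) {
  if (missing(log)) {
    # Start the search in the calling environment; FALSE means "no logging".
    log <- get0("log_arg", envir = parent.frame(),
                inherits = TRUE, ifnotfound = FALSE)
  }
  n_missing <- sum(is.na(x))
  # Assumption for this sketch: a character value is a log file path.
  if (is.character(log)) {
    cat("count_na:", n_missing, "missing value(s)\n", file = log, append = TRUE)
  }
  n_missing
}

# Hypothetical caller: it sets 'log_arg' once and never passes 'log'.
wrangle <- function(x) {
  log_arg <- tempfile(fileext = ".log")
  count_na(x)          # picks up log_arg from this function's environment
  readLines(log_arg)   # return what was written to the log file
}

log_lines <- wrangle(c(1, NA, 3, NA))
```

Passing the argument explicitly, for example `count_na(x, log = FALSE)`, skips the lookup, which mirrors the older style of assigning a value to 'log' directly.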

List of Functions

  • Inspection

    • inspect_map: Classify The Columns of A Data Frame
    • inspect_na: Find Out Which Columns Have Most NAs
    • inspect_smap: A Simplified Thus Faster Version of inspect_map
  • Imputation

    • impute_mean: Impute Missing Values by Mean
    • impute_median: Impute Missing Values by Median
    • impute_mode: Impute Missing Values by Mode
  • Encoding

    • encode_binary: Encode Binary Data Into 0 and 1
    • encode_ordinal: Encode Ordinal Data Into Integers
    • encode_onehot: One-Hot Encoding
  • Partitioning

    • partition_random: Partition A Dataset Randomly
  • Other

    • wh_dict: Create Data Dictionary from Data Warehouse
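
For readers unfamiliar with these operations, the following base R sketch shows what several of the categories above amount to on a toy data frame. It deliberately does not call the package's functions, so it says nothing about their signatures or internals. The data frame `df` and its column names are made up for illustration.

```r
# Toy data, made up for this sketch.
df <- data.frame(
  area  = c(120, NA, 95, 150),
  paved = c("Y", "N", "Y", NA),
  stringsAsFactors = FALSE
)

# Inspection: count the NAs in each column, largest count first.
na_counts <- sort(colSums(is.na(df)), decreasing = TRUE)

# Imputation by median: fill missing numeric values with the column median.
df$area[is.na(df$area)] <- median(df$area, na.rm = TRUE)

# Imputation by mode: fill missing values with the most frequent value.
mode_value <- names(which.max(table(df$paved)))
df$paved[is.na(df$paved)] <- mode_value

# Binary encoding: map the two categories to 0 and 1.
df$paved <- as.integer(df$paved == "Y")

# Random partitioning: split the row indices into two disjoint sets.
set.seed(1)
train_idx <- sample(nrow(df), size = 3)
train <- df[train_idx, ]
test  <- df[-train_idx, ]
```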