Missing Data Imputation
Feature-engine’s missing data imputers replace missing values with parameters estimated from the data or with arbitrary values defined by the user.
Summary of the main characteristics of Feature-engine’s imputers
Transformer | Numerical variables | Categorical variables | Description |
---|---|---|---|
MeanMedianImputer() | √ | × | Replaces missing values by the mean or median |
ArbitraryNumberImputer() | √ | × | Replaces missing values by an arbitrary value |
EndTailImputer() | √ | × | Replaces missing values by a value at the end of the distribution |
CategoricalImputer() | √ | √ | Replaces missing values by the most frequent category or by an arbitrary value |
RandomSampleImputer() | √ | √ | Replaces missing values by random value extractions from the variable |
AddMissingIndicator() | √ | √ | Adds a binary variable to flag missing observations |
DropMissingData() | √ | √ | Removes observations with missing data from the dataset |
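
To make the strategies in the table more concrete, here is a minimal sketch in plain Python (standard library only) of three of them: median imputation, most-frequent-category imputation, and a binary missing indicator. It only illustrates the ideas and is not how Feature-engine implements them. The helper names `impute_median`, `impute_frequent`, and `missing_indicator` are hypothetical, and `None` stands in for a missing value.

```python
from collections import Counter
from statistics import median


def impute_median(values):
    """Replace None with the median of the observed values (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_frequent(values):
    """Replace None with the most frequent observed category (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def missing_indicator(values):
    """Return 1 where the value is missing and 0 otherwise (hypothetical helper)."""
    return [1 if v is None else 0 for v in values]


# Toy data: None marks a missing observation.
ages = [20, None, 30, 40]
cities = ["London", "Paris", None, "London"]

ages_imputed = impute_median(ages)        # numerical variable
cities_imputed = impute_frequent(cities)  # categorical variable
ages_flag = missing_indicator(ages)       # flag computed on the original data
```

Note that the indicator is computed on the original values, before imputation, so the information about which observations were missing is preserved.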
The CategoricalImputer() performs procedures suitable for categorical variables. From version 1.1.0 it also accepts numerical variables as input, for those cases where variables that are categorical by nature are coded as numbers.