Feature-engine’s categorical encoders replace the category strings of a variable with estimated or arbitrary numbers. The following image summarizes the main encoders’ functionality.
Feature-engine’s categorical encoders work only with categorical variables (type object or categorical) by default. From version 1.1.0, you can set the parameter ignore_format to True to make the transformers also accept numerical variables as input.
Most of Feature-engine’s encoders return, or attempt to return, a monotonic relationship between the encoded variable and the target. A monotonic relationship is one in which, as the value of one variable increases, the value of the other variable consistently increases or consistently decreases. The following illustration shows examples:
Monotonic relationships tend to help improve the performance of linear models and build shallower decision trees.
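To make the idea concrete, here is a plain pandas sketch of an ordered encoding: categories are numbered according to their mean target value, so the encoded variable is monotonic with respect to the target mean. This illustrates the concept only and is not Feature-engine’s implementation. The data and column names are invented.

```python
import pandas as pd

# Toy data (hypothetical): a categorical variable and a binary target.
df = pd.DataFrame({
    "colour": ["red", "blue", "green", "red", "blue", "green", "red", "blue"],
    "target": [1, 0, 1, 1, 0, 0, 1, 1],
})

# Mean target per category, sorted from lowest to highest.
means = df.groupby("colour")["target"].mean().sort_values()

# Assign integers following that order: lowest mean gets 0.
mapping = {category: i for i, category in enumerate(means.index)}
df["colour_enc"] = df["colour"].map(mapping)

# The mean target now never decreases as the encoded value increases.
check = df.groupby("colour_enc")["target"].mean()
print(mapping)
print(check)
```

Because the integers follow the order of the target means, a linear model can capture the relationship with a single coefficient, and a tree can separate low from high target values with fewer splits.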
Regression vs Classification
Note that the MeanEncoder() and the OrdinalEncoder() will operate with multi-class targets, but the mean of the classes may not be meaningful, which defeats the purpose of these encoding techniques.
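The following pandas sketch shows why the mean of class labels can be uninformative. It uses invented data in which the classes are coded 0, 1 and 2. Two categories with very different class distributions end up with the same mean, so a mean-based encoding cannot tell them apart.

```python
import pandas as pd

# Toy data (hypothetical): a multi-class target coded as 0, 1, 2.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "label": [0, 2, 1, 1],
})

# Mean of the class labels per category: what a mean-based
# encoding would compute if given this target directly.
class_means = df.groupby("city")["label"].mean()

# Class counts per category, to compare the actual distributions.
class_counts = pd.crosstab(df["city"], df["label"])

print(class_means)
print(class_counts)
```

City A contains only classes 0 and 2 while city B contains only class 1, yet both receive the same mean. The integer codes of the classes are labels rather than quantities, so averaging them has no clear interpretation.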
Additional categorical encoding transformations are available in the open-source package Category encoders.