13.2 Categorical Variables

Purpose:

  • To transform categorical variables into numeric features that machine learning models can use (e.g., encoding or embedding in text mining)

Approaches:

  • One-hot encoding

  • Label encoding

  • Feature hashing

  • Binary encoding

  • Base N encoding

  • Frequency encoding

  • Target encoding

  • Ordinal encoding

  • Helmert encoding

  • Mean encoding (commonly used as another name for target encoding)

  • Weight of evidence encoding

  • Probability ratio encoding

  • Backward difference encoding

  • Leave one out encoding

  • James-Stein encoding

  • M-estimator encoding

  • Thermometer encoding
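
To make a few of the approaches above concrete, here is a minimal sketch in plain Python of label, one-hot, frequency, target (mean), and leave-one-out encoding. The function names, the toy `colors` and `bought` data, and the fallback to the global mean for categories that occur only once are assumptions made for illustration; they are not taken from any particular library, and library implementations usually add smoothing and handling of unseen categories.

```python
from collections import Counter, defaultdict

def label_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Build one 0/1 indicator column per category (columns sorted by name)."""
    categories = sorted(set(values))
    rows = [[int(v == c) for c in categories] for v in values]
    return rows, categories

def frequency_encode(values):
    """Replace each category with its relative frequency in the data."""
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in values]

def target_encode(values, targets):
    """Replace each category with the mean target observed for that category."""
    sums = defaultdict(float)
    counts = Counter(values)
    for v, y in zip(values, targets):
        sums[v] += y
    return [sums[v] / counts[v] for v in values]

def leave_one_out_encode(values, targets):
    """Like target encoding, but each row's own target is excluded from its mean.

    Assumption: a category seen only once falls back to the global target mean.
    """
    sums = defaultdict(float)
    counts = Counter(values)
    for v, y in zip(values, targets):
        sums[v] += y
    global_mean = sum(targets) / len(targets)
    encoded = []
    for v, y in zip(values, targets):
        if counts[v] > 1:
            encoded.append((sums[v] - y) / (counts[v] - 1))
        else:
            encoded.append(global_mean)
    return encoded

# Hypothetical toy data: a categorical feature and a binary target.
colors = ["red", "green", "blue", "green", "red", "green"]
bought = [1, 0, 1, 1, 0, 0]

codes, mapping = label_encode(colors)
one_hot, columns = one_hot_encode(colors)
freq = frequency_encode(colors)
target_means = target_encode(colors, bought)
loo = leave_one_out_encode(colors, bought)
```

Label and one-hot encoding use only the feature itself, whereas target and leave-one-out encoding use the response. Target-based encoders should therefore be fitted on training data only, so that information from the response does not leak into validation or test sets.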