Abstract
Feature engineering is the task of improving predictive modelling performance on a dataset by
transforming its feature space. Existing approaches to automating this process rely either on
exploring the transformed feature space through evaluation-guided search, or on explicitly
expanding datasets with all transformed features and then performing feature selection. Both
approaches incur high computational costs in runtime and/or memory. We
present a novel technique, called Learning Feature
Engineering (LFE), for automating feature engineering in classification tasks. LFE is based on
learning the effectiveness of applying a transformation (e.g., arithmetic or aggregate operators) on
numerical features, from past feature engineering
experiences. Given a new dataset, LFE recommends a set of useful transformations to be applied
on features without relying on model evaluation or
explicit feature expansion and selection. Using a
collection of datasets, we train a set of neural networks that predict which transformations
will positively impact classification performance.
Our empirical results show that LFE outperforms
other feature engineering approaches on an overwhelming majority (89%) of datasets from
various sources, while incurring a substantially lower
computational cost.
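To make the core idea concrete, the sketch below illustrates the kind of learning LFE describes: a classifier is trained on past "experiences" to predict, from a fixed-size representation of a numerical feature, whether a given transformation (here, a log transform) is likely to help. This is a simplified illustration, not the paper's implementation; the histogram representation, the synthetic training data, and the use of scikit-learn's `MLPClassifier` are all assumptions made for the example.

```python
# Conceptual sketch (not the paper's method): learn, from past examples,
# whether applying a log transform to a numerical feature is likely to
# improve downstream classification.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def represent(feature, bins=20):
    """Map a numerical feature of any length to a fixed-size vector:
    a normalized histogram over its value range (a crude stand-in for
    a learned feature representation)."""
    hist, _ = np.histogram(feature, bins=bins)
    return hist / max(hist.sum(), 1)

# Synthetic "past feature engineering experiences": heavily skewed
# features are labeled 1 (log transform tended to help), roughly
# symmetric features are labeled 0 (it tended not to).
X, y = [], []
for _ in range(300):
    skewed = rng.lognormal(0.0, 1.0, size=200)   # log typically helps
    normal = rng.normal(5.0, 1.0, size=200)      # log typically does not
    X += [represent(skewed), represent(normal)]
    y += [1, 0]

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
clf.fit(np.array(X), np.array(y))

# Recommend (or not) the transformation for a new, unseen feature
# without evaluating any downstream model.
new_feature = rng.lognormal(0.0, 1.0, size=200)
recommendation = clf.predict([represent(new_feature)])[0]
print(recommendation)  # 1 means "log transform predicted to help"
```

The key property this mirrors is the one the abstract claims: once trained, the recommender needs only the feature's representation, so no candidate transformation is materialized and no downstream model is evaluated at recommendation time.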