Data Selection Intro: Filter Method, Wrapper Method and Embedded Method

Apr 6, 2023

Compare the differences between the Filter Method, Wrapper Method and Embedded Method.

We can divide data selection (more commonly called feature selection) into 3 categories: Filter Method, Wrapper Method, and Embedded Method. Let’s discuss what each method does, along with its pros and cons.

Filter Method

Definition: Selects features based on statistical tests or correlations

Pros:

Easy and fast to implement
Does not require fitting a model
Can handle a large number of features
Can be used as a pre-processing step before other methods
Selected features are easy to interpret

Cons:

May not capture complex relationships between the features and the target variable
May select redundant or irrelevant features, because features are typically scored one at a time

Examples:

Correlation
Chi-square test
ANOVA
Information Gain
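
To make the idea of scoring features with a statistical test concrete, here is a minimal sketch of a chi-square filter written with NumPy only. The toy data and the names chi_square_score, plan and newsletter are hypothetical and exist only for illustration. Note that no predictive model is fitted at any point.

```python
import numpy as np

def chi_square_score(feature, target):
    """Pearson chi-square statistic between a categorical feature and a categorical target."""
    f_values, f_idx = np.unique(feature, return_inverse=True)
    t_values, t_idx = np.unique(target, return_inverse=True)
    # Contingency table: rows are feature categories, columns are target classes.
    observed = np.zeros((len(f_values), len(t_values)))
    np.add.at(observed, (f_idx, t_idx), 1)
    # Counts we would expect if the feature and the target were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    return float(((observed - expected) ** 2 / expected).sum())

# Hypothetical toy data: 40 customers, target = churned (1) or stayed (0).
target = np.array([0] * 20 + [1] * 20)
plan = np.array(["low"] * 20 + ["high"] * 20)  # lines up perfectly with the target
newsletter = np.array([0, 1] * 20)              # split evenly within each class

scores = {
    "plan": chi_square_score(plan, target),
    "newsletter": chi_square_score(newsletter, target),
}
# Rank features from most to least associated with the target.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy case plan gets a score of 40 and newsletter gets 0, so a filter that keeps the top feature would keep plan. A real pipeline would also convert the statistic to a p-value or keep the top k features.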

Wrapper Method

Definition: Uses a predictive model to evaluate candidate subsets of features

Pros:

Can capture complex relationships between features and the target variable
Can select a subset of features that work well together for the given model

Cons:

Computationally expensive
May overfit the model
May miss important features
May not be easily interpretable

Examples:

Forward selection
Backward selection
Stepwise selection
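
As a rough illustration of forward selection, the sketch below starts from an empty set and repeatedly adds the feature that lowers the validation error the most. It assumes a plain least-squares linear model and a single train/validation split, both chosen only to keep the example short. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def validation_mse(X_train, y_train, X_val, y_val, cols):
    """Fit least squares on the chosen columns and return the validation MSE."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_val @ coef - y_val) ** 2))

def forward_selection(X_train, y_train, X_val, y_val, n_select):
    """Greedily add the feature that reduces validation error the most."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    while len(selected) < n_select:
        errors = {
            j: validation_mse(X_train, y_train, X_val, y_val, selected + [j])
            for j in remaining
        }
        best = min(errors, key=errors.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data (an assumption for the demo): only features 1 and 3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 4.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=300)

selected = forward_selection(X[:200], y[:200], X[200:], y[200:], n_select=2)
print(selected)
```

Every candidate feature requires a new model fit at every step, which is where the computational cost of wrapper methods comes from. Backward selection follows the same pattern but starts with all features and removes one at a time.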

Embedded Method

Definition: Learns feature importance as part of the model training process

Pros:

Can capture complex relationships between features and the target variable
Considers feature interactions
Can be more efficient than wrapper methods for large datasets

Cons:

Can be computationally expensive during model training
May not perform well with certain types of models or data
May miss important features
May not be easily interpretable, especially for complex models

Here are some examples of each method, which may help you tell which method suits which situation.

Filter Method Example:

Suppose you have a dataset with many features, and you want to select only the most relevant ones for predicting the target variable. You could use a filter method such as the correlation-based feature selection (CFS) algorithm. CFS evaluates the relevance of each feature by computing its correlation with the target variable, and also evaluates the redundancy between features by computing their correlation with each other. It then selects a subset of features that is highly relevant to the target variable while keeping the redundancy among the selected features low.
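
The trade-off between relevance and redundancy described above can be sketched as follows. This is a simplified version that assumes absolute Pearson correlation as the association measure and a greedy forward search that stops when the score no longer improves. The merit score is the standard CFS heuristic, k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean feature-target correlation and r_ff is the mean feature-feature correlation inside the subset. Function names and data are hypothetical.

```python
import numpy as np

def cfs_merit(corr_fc, corr_ff, subset):
    """CFS merit: high feature-target correlation, low feature-feature correlation."""
    k = len(subset)
    r_cf = corr_fc[subset].mean()
    if k == 1:
        return float(r_cf)
    sub = corr_ff[np.ix_(subset, subset)]
    # Mean of the off-diagonal entries (the diagonal is all ones).
    r_ff = (sub.sum() - k) / (k * (k - 1))
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

def cfs_forward(X, y):
    """Greedy forward search that stops when the merit stops improving."""
    full = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    corr_ff, corr_fc = full[:-1, :-1], full[:-1, -1]
    selected, best_merit = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        merits = {j: cfs_merit(corr_fc, corr_ff, selected + [j]) for j in remaining}
        candidate = max(merits, key=merits.get)
        if merits[candidate] <= best_merit:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_merit = merits[candidate]
    return selected, best_merit

# Synthetic data (an assumption for the demo):
# feature 1 is a near copy of feature 0, feature 3 is pure noise.
rng = np.random.default_rng(0)
n = 1000
a, b = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([a, a + 0.1 * rng.normal(size=n), b, rng.normal(size=n)])
y = a + b + 0.1 * rng.normal(size=n)

selected, merit = cfs_forward(X, y)
print(sorted(selected), round(merit, 3))
```

On data like this, the search keeps feature 2 plus only one of the two near-duplicate features, because adding the second copy raises redundancy without adding relevance.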

Wrapper Method Example:

Suppose you want to train a machine learning model to predict whether a customer will churn (i.e., cancel their subscription) based on their demographic and behavioral data. You could use a wrapper method such as recursive feature elimination (RFE) with a logistic regression model. RFE starts with all the features and trains the logistic regression model, then removes the least important feature and trains the model again, and repeats this process until a desired number of features is reached. The features selected by RFE are those that work best with the logistic regression model for the given prediction task.
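
The RFE loop described above can be sketched in a few lines. To stay self-contained, this version assumes a small gradient-descent logistic regression written in NumPy and uses the absolute coefficient on standardized features as the importance measure. The helper names, the learning rate and the synthetic "churn" data are all hypothetical. In practice you would more likely combine scikit-learn's RFE class with LogisticRegression.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, n_iter=300):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def rfe(X, y, n_keep):
    """Refit the model and drop the weakest feature until n_keep remain."""
    # Standardize so that coefficient magnitudes are comparable across features.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        w, _ = fit_logistic(X[:, active], y)
        weakest = int(np.argmin(np.abs(w)))  # position within the active list
        del active[weakest]
    return active

# Synthetic churn-like data (an assumption): only features 0 and 3 drive the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
signal = 2.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.normal(size=600)
y = (signal > 0).astype(float)

kept = rfe(X, y, n_keep=2)
print(kept)
```

The model is retrained once per removed feature, so the cost grows with the number of features to eliminate.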

Embedded Method Example:

Suppose you want to train a deep neural network to classify images of cats and dogs. You could use a technique such as dropout regularization, which shapes which features the network relies on as part of the training process. Dropout randomly drops some of the neurons at each training step, forcing the network to learn redundant representations of the features. The features that are more important for the classification task will tend to have a stronger presence in the network’s learned representations, and will thus be more robust to dropout. In that loose sense, dropout can be seen as a form of feature selection that is integrated into the learning process, although strictly speaking it is a regularizer and does not return an explicit subset of features.
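
Since dropout does not output a list of selected features, the sketch below swaps in a different technique that is commonly cited as an embedded method: L1-regularized linear regression (Lasso). The L1 penalty pushes the coefficients of unhelpful features to exactly zero during training, so the selection happens as a side effect of fitting the model. This is a minimal coordinate-descent version in NumPy. The function names, the alpha value and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimize (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(d):
            # Residual with the contribution of feature j added back in.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w

# Synthetic data (an assumption for the demo): only features 0 and 4 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + rng.normal(scale=0.1, size=400)

# Center the data so that no intercept term is needed.
X = X - X.mean(axis=0)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, alpha=0.5)
selected = np.flatnonzero(w)  # features with a non-zero coefficient
print(selected, np.round(w, 2))
```

Only one model is trained here, which is why embedded methods can be cheaper than wrapper methods. A larger alpha removes more features, at the price of shrinking the surviving coefficients further.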

Reference:

A comprehensive guide to Feature Selection using Wrapper methods in Python

Written by SharkYun