Context:
The ‘All You Need’ supermarket wants to make its marketing campaigns more efficient, particularly the year-end gold membership offer, which is normally priced at $999 but offered to select customers at $499. The campaign’s success hinges on accurately identifying the customers likely to take up this offer.
Objective:
Develop a supervised learning model using decision trees and other classifiers to predict customer responses to the gold membership campaign, thus improving targeting and reducing marketing expenses.
Data:
Dataset: Derived from the supermarket’s customer database, containing 2,240 samples with 21 features, including:
Customer demographics (Age, Income, Education, Marital Status)
Purchase history (Amount spent on various products over the last two years)
Engagement metrics (Recency of purchases, Website visits, Previous campaign responses)
Methodology:
Data Preprocessing:
Removed unique identifiers and irrelevant features (e.g., CustomerID).
Encoded categorical data (Education level, Marital Status).
Handled missing values using KNN imputation.
Split data into training (70%) and testing (30%) sets.
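The preprocessing steps above can be sketched as follows. This is a minimal illustration on a synthetic table, not the project's actual code: the column names (other than CustomerID), the toy data, the choice of one-hot encoding, the stratified split, and n_neighbors=5 are all assumptions. The sketch also splits before imputing so that the imputer is fitted on the training rows only, which is one common way to avoid leaking test information.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split

# Toy stand-in for the customer table; column names and values are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "CustomerID": np.arange(n),
    "Income": rng.normal(50_000, 15_000, n),
    "Education": rng.choice(["Graduation", "Master", "PhD"], n),
    "Marital_Status": rng.choice(["Single", "Married", "Divorced"], n),
    "Recency": rng.integers(0, 100, n).astype(float),
    "Response": rng.integers(0, 2, n),
})
# Inject a few missing values so the imputer has something to do.
df.loc[rng.choice(n, 10, replace=False), "Income"] = np.nan

# 1. Drop the unique identifier, which carries no predictive signal.
df = df.drop(columns=["CustomerID"])

# 2. One-hot encode the categorical columns (an assumed encoding scheme).
X = pd.get_dummies(
    df.drop(columns=["Response"]),
    columns=["Education", "Marital_Status"],
    dtype=float,
)
y = df["Response"]

# 3. 70/30 split; stratifying on the target is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# 4. KNN imputation, fitted on the training rows and applied to both sets.
imputer = KNNImputer(n_neighbors=5)
X_train_imp = pd.DataFrame(
    imputer.fit_transform(X_train), columns=X.columns, index=X_train.index
)
X_test_imp = pd.DataFrame(
    imputer.transform(X_test), columns=X.columns, index=X_test.index
)

print(X_train_imp.shape, X_test_imp.shape)
```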
Machine Learning Model:
Implemented multiple classifiers: Logistic Regression, Decision Trees, Random Forest, Gradient Boosting, AdaBoost, and XGBoost.
Used Scikit-Learn and XGBoost libraries for model building and evaluation.
Employed Stratified K-Folds cross-validation to ensure consistent performance across different data subsets.
Applied hyperparameter tuning using GridSearchCV and RandomizedSearchCV to optimize model parameters.
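A rough sketch of the cross-validation and tuning setup described above is shown here. To keep it dependent on Scikit-Learn only, it uses GradientBoostingClassifier as a stand-in for XGBoost; the synthetic data, the class balance, the search space, the number of folds, and the use of recall as the tuning metric are all illustrative assumptions rather than the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import (
    RandomizedSearchCV,
    StratifiedKFold,
    cross_val_score,
)

# Synthetic binary data; the 85/15 class balance is assumed for illustration.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.85, 0.15], random_state=0
)

# Stratified folds keep the responder ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Baseline: cross-validated recall of an untuned model.
baseline = GradientBoostingClassifier(random_state=0)
baseline_recall = cross_val_score(baseline, X, y, cv=cv, scoring="recall")
print("baseline recall per fold:", baseline_recall.round(3))

# Hypothetical search space; RandomizedSearchCV samples n_iter combinations.
param_distributions = {
    "n_estimators": [50, 100, 150],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [2, 3, 4],
    "subsample": [0.7, 0.85, 1.0],
}
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    scoring="recall",
    cv=cv,
    random_state=0,
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best mean CV recall:", round(search.best_score_, 3))
```

GridSearchCV takes the same estimator, cv, and scoring arguments but evaluates every combination in the grid, so the randomized variant is usually the cheaper choice when the search space is large.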
Model Evaluation:
Evaluated model performance using accuracy, recall, precision, and the confusion matrix.
Compared multiple models to identify the best performing algorithm.
Selected XGBoost tuned with RandomizedSearchCV as the final model because it had the highest recall and balanced performance across the other metrics.
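To make the evaluation metrics concrete, the small example below computes them on ten hand-written labels. The labels are made up for illustration and are not project results. With 1 meaning "accepted the offer", recall is the share of actual responders the model catches, which is presumably why it was the deciding metric here: a missed responder is a lost sale.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical labels: 1 = accepted the offer, 0 = did not.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 7 / 10
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)    = 4 / 5
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)    = 4 / 6

print(cm)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```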
Results:
The XGBoost model, tuned using RandomizedSearchCV, achieved a recall of 87% on the test set, meaning it correctly flagged 87% of the customers who actually responded to the offer. The model also highlights spending on gold products, recency of purchases, and catalog purchases as key factors, which can guide the supermarket in refining its marketing strategy.
