End-to-End MMM Analysis

This tutorial walks through a complete Marketing Mix Modeling workflow using DeepCausalMMM, from data generation to budget optimization.

Overview

In this tutorial, you’ll learn how to:

  1. Generate synthetic MMM data

  2. Configure and train a model

  3. Analyze attribution results

  4. Fit response curves

  5. Optimize budget allocation

This entire workflow uses synthetic data, so you can run it immediately without any external data files.

Step 1: Setup and Data Generation

First, let’s import necessary libraries and generate synthetic MMM data:

import numpy as np
import pandas as pd
import torch
from deepcausalmmm import get_device
from deepcausalmmm.core import get_default_config
from deepcausalmmm.core.trainer import ModelTrainer
from deepcausalmmm.core.data import UnifiedDataPipeline
from deepcausalmmm.utils.data_generator import ConfigurableDataGenerator

# Set random seed for reproducibility
np.random.seed(42)

# Initialize synthetic data generator
generator = ConfigurableDataGenerator()

# Generate MMM dataset
# Returns: X_media [regions, weeks, channels], X_control, y [regions, weeks]
X_media, X_control, y = generator.generate_mmm_dataset(
    n_regions=10,
    n_weeks=104,
    n_media_channels=5,
    n_control_channels=3
)

print(f"Generated data shapes:")
print(f"  Media: {X_media.shape}")      # (10, 104, 5)
print(f"  Control: {X_control.shape}")  # (10, 104, 3)
print(f"  Target: {y.shape}")           # (10, 104)

Understanding the Data

Our synthetic dataset represents:

  • 10 regions (e.g., different DMAs or geographic markets)

  • 104 weeks (~2 years of data)

  • 5 media channels (e.g., TV, Search, Social, Display, Radio)

  • 3 control variables (e.g., price, promotion, competitor spend)

  • Target variable (e.g., sales, revenue, or conversions)

Step 2: Configure the Model

DeepCausalMMM is highly configurable. Let’s start with the default configuration and customize key parameters:

# Get default configuration
config = get_default_config()

# Customize for our use case
config['n_epochs'] = 500  # Reduce for faster training (default: 1500)
config['learning_rate'] = 0.001

# Attribution prior regularization (optional but recommended)
config['media_contribution_prior'] = 0.40  # Target 40% media attribution
config['attribution_reg_weight'] = 0.5     # Balanced regularization

# Check device
device = get_device()
print(f"Training on: {device}")

Key Configuration Parameters

  • n_epochs: Number of training iterations (500-1500)

  • media_contribution_prior: Target percentage for media attribution (0.30-0.50)

  • attribution_reg_weight: Regularization strength (0.0-1.0)

  • holdout_ratio: Validation split percentage (default: 0.12)

Step 3: Prepare Data Pipeline

The UnifiedDataPipeline handles all data preprocessing:

# Initialize pipeline
pipeline = UnifiedDataPipeline(config)

# Split data temporally (training: 88%, holdout: 12%)
train_data, holdout_data = pipeline.temporal_split(X_media, X_control, y)

# Transform training data (fit scalers)
train_tensors = pipeline.fit_and_transform_training(train_data)

# Transform holdout data (use fitted scalers)
holdout_tensors = pipeline.transform_holdout(holdout_data)

print(f"Training weeks: {train_tensors['y'].shape[1]}")
print(f"Holdout weeks: {holdout_tensors['y'].shape[1]}")

What the Pipeline Does

  1. Temporal Split: Splits data by time (last 12% for validation)

  2. Scaling:

    • Media: SOV (Share of Voice) scaling

    • Controls: Z-score normalization

    • Target: Linear scaling (y/y_mean per region)

  3. Tensor Conversion: Converts to PyTorch tensors

Step 4: Create and Train Model

Now let’s create the model and train it:

# Initialize trainer
trainer = ModelTrainer(config)

# Create model
model = trainer.create_model(
    n_media=train_tensors['X_media'].shape[2],      # 5 channels
    n_control=train_tensors['X_control'].shape[2],  # 3 controls
    n_regions=train_tensors['X_media'].shape[0]     # 10 regions
)

# Create optimizer and scheduler
trainer.create_optimizer_and_scheduler()

print("Model architecture created successfully!")
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")

Training the Model

# Train model (this will take a few minutes)
results = trainer.train(
    train_tensors['X_media'],
    train_tensors['X_control'],
    train_tensors['R'],  # Regional indicators
    train_tensors['y'],
    holdout_tensors['X_media'],
    holdout_tensors['X_control'],
    holdout_tensors['R'],
    holdout_tensors['y'],
    pipeline=pipeline,
    verbose=True
)

# View results
print("\nTraining Results:")
print(f"  Training R²: {results['final_train_r2']:.3f}")
print(f"  Holdout R²: {results['final_holdout_r2']:.3f}")
print(f"  Training RMSE: {results['final_train_rmse']:,.0f}")
print(f"  Holdout RMSE: {results['final_holdout_rmse']:,.0f}")

Understanding the Results

Performance on synthetic data varies depending on:

  • Number of epochs: More epochs (1500-2500) improve convergence

  • Data complexity: Random synthetic data is harder to fit than real data with consistent patterns

  • Regularization: Attribution priors may constrain fit quality initially

For real-world MMM data with consistent marketing patterns, expect significantly better performance (see benchmark results in README.md)

Step 5: Analyze Attribution

Extract and analyze channel contributions:

# Get model predictions and contributions
# Note: model.forward() returns a tuple: (y_pred, media_coeffs, media_contrib, outputs_dict)
with torch.no_grad():
    y_pred, media_coeffs, media_contrib_direct, outputs = model(
        train_tensors['X_media'],
        train_tensors['X_control'],
        train_tensors['R']
    )

# Extract contributions from outputs dictionary (already in original scale)
media_contrib = outputs['contributions'].cpu().numpy()  # [regions, weeks, channels]
baseline_contrib = outputs['baseline'].cpu().numpy()
seasonal_contrib = outputs['seasonal_contribution'].cpu().numpy()
control_contrib = outputs['control_contributions'].cpu().numpy()

# Calculate total contributions
total_media = media_contrib.sum()
total_baseline = baseline_contrib.sum()
total_seasonal = seasonal_contrib.sum()
total_control = control_contrib.sum()
total_predicted = y_pred.sum().cpu().numpy()  # Use y_pred from tuple return

# Calculate percentages
print("\nAttribution Breakdown:")
print(f"  Media: {(total_media/total_predicted)*100:.1f}%")
print(f"  Baseline: {(total_baseline/total_predicted)*100:.1f}%")
print(f"  Seasonal: {(total_seasonal/total_predicted)*100:.1f}%")
print(f"  Controls: {(total_control/total_predicted)*100:.1f}%")

# Verify additivity (should be ~100%)
total_pct = ((total_media + total_baseline + total_seasonal + total_control) /
             total_predicted) * 100
print(f"\nAdditivity Check: {total_pct:.2f}% (should be ~100%)")

Channel-Level Attribution

# Sum contributions across regions and time for each channel
channel_contributions = media_contrib.sum(axis=(0, 1))

# Create channel names
channel_names = [f'Channel_{i+1}' for i in range(len(channel_contributions))]

# Display results
print("\nChannel Contributions:")
for name, contrib in zip(channel_names, channel_contributions):
    pct = (contrib / total_media) * 100
    print(f"  {name}: {contrib:,.0f} ({pct:.1f}% of media)")

Step 6: Fit Response Curves

Analyze saturation and diminishing returns for each channel.

Note: Curve fitting may not succeed with limited training epochs or synthetic data. This is expected and OK - the code handles it gracefully. For production use with real data and sufficient training, curves will fit successfully.

from deepcausalmmm.postprocess import ResponseCurveFit

# Prepare data for response curve fitting
results_curves = []

# Important: Use train_data (not X_media) to match contrib dimensions
# And remove padding from contributions
burn_in_weeks = config.get('burn_in_weeks', 6)
media_contrib_no_padding = media_contrib[:, burn_in_weeks:, :]  # Remove padding

for channel_idx in range(train_data['X_media'].shape[2]):
    # Aggregate data for this channel across regions
    channel_spend = train_data['X_media'][:, :, channel_idx].sum(axis=0)  # [weeks]
    channel_contrib = media_contrib_no_padding[:, :, channel_idx].sum(axis=0)  # [weeks]

    # ResponseCurveFit expects specific column names
    df = pd.DataFrame({
        'week_monday': pd.date_range('2023-01-01', periods=len(channel_spend), freq='W-MON'),
        'spend': channel_spend,
        'impressions': channel_spend,  # Use spend as impressions if not separate
        'predicted': channel_contrib
    })

    # Fit response curve
    fitter = ResponseCurveFit(
        data=df,
        bottom_param=False,  # Assume zero response at zero spend
        model_level='Overall',  # 'Overall' for aggregated, 'DMA' for region-level
        date_col='week_monday'
    )

    try:
        # Fit the curve
        fitted_df = fitter.fit()

        if fitted_df is not None and hasattr(fitter, 'slope'):
            results_curves.append({
                'channel': f'Channel_{channel_idx+1}',
                'slope_a': fitter.slope,
                'half_saturation_g': fitter.saturation,
                'top': fitter.top
            })

            print(f"\nChannel {channel_idx+1}:")
            print(f"  Slope (a): {fitter.slope:.2f}")
            print(f"  Half-saturation (g): {fitter.saturation:,.0f}")
            print(f"  Top (max response): {fitter.top:,.0f}")
        else:
            print(f"Channel {channel_idx+1}: Curve fitting skipped - insufficient data variation (OK)")

    except Exception as e:
        print(f"Channel {channel_idx+1}: Curve fitting skipped - {e} (OK)")

# Create summary DataFrame
if len(results_curves) > 0:
    curves_df = pd.DataFrame(results_curves)
    print("\nResponse Curve Summary:")
    print(curves_df.to_string(index=False))
else:
    print("\nNote: No response curves fitted - this is OK and expected with:")
    print("  - Quick tutorial runs with limited epochs (50-100)")
    print("  - Random synthetic data without consistent marketing patterns")
    print("  - Low channel spend variation in training data")
    print("  For production: Use 1500-2500 epochs with real marketing data")

Interpreting Response Curves

Slope Parameter (a):

  • a >= 2.0: Strong S-curve with clear saturation

  • 1.0 <= a < 2.0: Moderate saturation

  • a < 1.0: Almost linear (limited saturation observed)

Half-Saturation Point (g):

  • Lower values: Channel saturates quickly

  • Higher values: Channel has more room for growth

R² Score:

  • >= 0.8: Excellent fit

  • 0.6-0.8: Good fit

  • < 0.6: Review data quality

Step 7: Budget Optimization

Use fitted response curves to optimize budget allocation:

from deepcausalmmm.postprocess import BudgetOptimizer

# Initialize optimizer
optimizer = BudgetOptimizer(
    response_curves_df=curves_df,
    method='SLSQP'
)

# Define total budget (e.g., $1M)
total_budget = 1_000_000

# Set channel-specific constraints (optional)
# Note: Underscores in numbers (e.g., 100_000) are for readability (Python 3.6+)
constraints = {
    'Channel_1': {'lower': 100_000, 'upper': 400_000},
    'Channel_2': {'lower': 150_000, 'upper': 500_000},
    'Channel_3': {'lower': 50_000, 'upper': 300_000},
    'Channel_4': {'lower': 100_000, 'upper': 400_000},
    'Channel_5': {'lower': 50_000, 'upper': 250_000},
}

# Optimize allocation
optimal_allocation = optimizer.optimize_budget(
    total_budget=total_budget,
    constraints=constraints
)

# Display results
print("\nOptimal Budget Allocation:")
for channel, amount in optimal_allocation.items():
    pct = (amount / total_budget) * 100
    print(f"  {channel}: ${amount:,.0f} ({pct:.1f}%)")

# Calculate expected ROI
expected_roi = optimizer.calculate_expected_roi(optimal_allocation)
print(f"\nExpected Total ROI: {expected_roi:.2f}x")

Optimization Tips

  1. Set Realistic Constraints: Use historical min/max as guides

  2. Try Different Methods: ‘SLSQP’ (fast) or ‘trust-constr’ (robust)

  3. Validate Results: Compare optimal allocation with historical spend

  4. Consider Seasonality: Run separate optimizations for different periods

Step 8: Save and Export Results

Save your model and results for future use:

# Save trained model
torch.save({
    'model_state': model.state_dict(),
    'config': config,
    'results': results,
    'pipeline': pipeline
}, 'mmm_model_trained.pth')

# Export attribution results
attribution_df = pd.DataFrame({
    'component': ['Media', 'Baseline', 'Seasonal', 'Controls'],
    'contribution': [total_media, total_baseline, total_seasonal, total_control],
    'percentage': [
        (total_media/total_predicted)*100,
        (total_baseline/total_predicted)*100,
        (total_seasonal/total_predicted)*100,
        (total_control/total_predicted)*100
    ]
})
attribution_df.to_csv('attribution_results.csv', index=False)

# Export response curves
curves_df.to_csv('response_curves.csv', index=False)

print("\nResults saved successfully!")

Complete Script

Here’s the complete workflow in one script:

import numpy as np
import pandas as pd
import torch
from deepcausalmmm import get_device
from deepcausalmmm.core import get_default_config
from deepcausalmmm.core.trainer import ModelTrainer
from deepcausalmmm.core.data import UnifiedDataPipeline
from deepcausalmmm.utils.data_generator import ConfigurableDataGenerator
from deepcausalmmm.postprocess import ResponseCurveFit, BudgetOptimizer

# Set seed
np.random.seed(42)

# 1. Generate data
generator = ConfigurableDataGenerator(
    base_sales=50000, n_media_channels=5, n_control_vars=3
)
X_media, X_control, y = generator.generate_mmm_dataset(
    n_regions=10, n_weeks=104
)

# 2. Configure and prepare
config = get_default_config()
config['n_epochs'] = 500
pipeline = UnifiedDataPipeline(config)
train_data, holdout_data = pipeline.temporal_split(X_media, X_control, y)
train_tensors = pipeline.fit_and_transform_training(train_data)
holdout_tensors = pipeline.transform_holdout(holdout_data)

# 3. Train model
trainer = ModelTrainer(config)
model = trainer.create_model(
    n_media=5, n_control=3, n_regions=10
)
trainer.create_optimizer_and_scheduler()
results = trainer.train(
    train_tensors['X_media'], train_tensors['X_control'],
    train_tensors['R'], train_tensors['y'],
    holdout_tensors['X_media'], holdout_tensors['X_control'],
    holdout_tensors['R'], holdout_tensors['y'],
    pipeline=pipeline, verbose=True
)

# 4. Analyze results
print(f"Training R²: {results['final_train_r2']:.3f}")
print(f"Holdout R²: {results['final_holdout_r2']:.3f}")

# 5. (Optional) Fit response curves and optimize budget
# See detailed steps above for response curve and optimization code

Next Steps

Now that you’ve completed this tutorial, you can:

  1. Try Real Data: Replace synthetic data with your actual MMM data

  2. Customize Configuration: Experiment with different hyperparameters

  3. Advanced Visualization: Run the full dashboard (examples/dashboard_rmse_optimized.py)

  4. Budget Optimization: Use the complete optimization workflow (examples/example_budget_optimization.py)

Troubleshooting

Low Holdout R²:

  • Increase training epochs

  • Adjust regularization (L1/L2 weights)

  • Check data quality

Unrealistic Attribution:

  • Set attribution priors (media_contribution_prior)

  • Increase attribution_reg_weight

  • Review channel correlation

Training Instability:

  • Reduce learning rate

  • Increase gradient clipping

  • Check for data outliers

See Also