Quick Start Guide

This guide will get you up and running with DeepCausalMMM in just a few minutes.

Basic Usage

Here’s a complete example of training a DeepCausalMMM model:

import numpy as np
from deepcausalmmm import get_device
from deepcausalmmm.core import get_default_config
from deepcausalmmm.core.trainer import ModelTrainer
from deepcausalmmm.core.data import UnifiedDataPipeline

# 1. Prepare your data (or load from CSV and reshape)
# Data shape: [n_regions, n_weeks, n_channels]
X_media = np.random.uniform(100, 5000, (10, 52, 5))    # Media spend/impressions
X_control = np.random.uniform(0, 1, (10, 52, 3))       # Control variables
y = np.random.uniform(1000, 10000, (10, 52))           # Target variable

# 2. Get optimized configuration
config = get_default_config()

# 3. Check device availability
device = get_device()
print(f"Using device: {device}")

# 4. Process data with unified pipeline
pipeline = UnifiedDataPipeline(config)
train_data, holdout_data = pipeline.temporal_split(X_media, X_control, y)
train_tensors = pipeline.fit_and_transform_training(train_data)
holdout_tensors = pipeline.transform_holdout(holdout_data)

# 5. Train with ModelTrainer (recommended approach)
trainer = ModelTrainer(config)
model = trainer.create_model(
    n_media=train_tensors['X_media'].shape[2],
    n_control=train_tensors['X_control'].shape[2],
    n_regions=train_tensors['X_media'].shape[0]
)
trainer.create_optimizer_and_scheduler()

results = trainer.train(
    train_tensors['X_media'], train_tensors['X_control'],
    train_tensors['R'], train_tensors['y'],
    holdout_tensors['X_media'], holdout_tensors['X_control'],
    holdout_tensors['R'], holdout_tensors['y'],
    pipeline=pipeline,
    verbose=True
)

# 6. View results
print(f"Training RMSE: {results['final_train_rmse']:.0f}")
print(f"Training R²: {results['final_train_r2']:.3f}")
print(f"Holdout RMSE: {results['final_holdout_rmse']:.0f}")
print(f"Holdout R²: {results['final_holdout_r2']:.3f}")

One-Command Analysis

For the complete analysis with dashboard generation:

# Clone the repository first (examples not included in pip install)
git clone https://github.com/adityapt/deepcausalmmm.git
cd deepcausalmmm
pip install -e .

# Run the comprehensive dashboard
python examples/dashboard_rmse_optimized.py

This will:

  • Load and process your data

  • Train the model with optimal settings

  • Generate comprehensive visualizations

  • Create an interactive dashboard

  • Save results and model artifacts

Data Format Requirements

DeepCausalMMM expects data in NumPy array format with shape [n_regions, n_weeks, n_channels]:

  • X_media: Media spend or impressions [n_regions, n_weeks, n_media_channels]

  • X_control: Control variables [n_regions, n_weeks, n_control_variables]

  • y: Target variable [n_regions, n_weeks]

If loading from CSV, you’ll need to reshape your data. Your CSV should have this structure:

date,region,target_variable,media_channel_1,media_channel_2,...,control_var_1,control_var_2,...
2023-01-01,DMA_1,12500,1000,500,...,0.2,15,...
2023-01-08,DMA_1,13200,1200,600,...,0.1,18,...
...

To convert from CSV to the required format:

import pandas as pd
import numpy as np

df = pd.read_csv('mmm_data.csv')
regions = df['region'].unique()
weeks = df['week'].unique()

# Reshape to [regions, weeks, channels]
X_media = np.stack([
    df.pivot(index='region', columns='week', values=col).values
    for col in media_columns
], axis=-1)

X_control = np.stack([
    df.pivot(index='region', columns='week', values=col).values
    for col in control_columns
], axis=-1)

y = df.pivot(index='region', columns='week', values='target').values

Configuration Customization

Customize the model for your specific use case:

from deepcausalmmm.core import get_default_config
from deepcausalmmm.core.trainer import ModelTrainer

# Get base configuration
config = get_default_config()

# Customize for your dataset
config.update({
    'n_epochs': 2000,           # Adjust training duration
    'learning_rate': 0.005,     # Fine-tune learning rate
    'hidden_dim': 256,          # Model capacity
    'dropout': 0.1,             # Regularization
    'holdout_ratio': 0.15,      # Validation split (default 0.08)
    'burn_in_weeks': 8,         # GRU stabilization
})

# Train with custom config (see Basic Usage for full data preparation)
trainer = ModelTrainer(config)
model = trainer.create_model(n_media, n_control, n_regions)
trainer.create_optimizer_and_scheduler()
results = trainer.train(...)

NOTEARS DAG learning (optional)

The default dag_mode is 'triangular' (acyclicity via an upper-triangular mask). To learn channel ordering from data with the NOTEARS continuous penalty (Zheng et al., 2018), opt in explicitly:

config = get_default_config()
config['dag_mode'] = 'notears'
# Recommended starting points (see config.py for full list):
config['notears_warmup_epochs'] = 500   # Huber-only, then enable penalty
config['notears_lambda1'] = 0.005
config['dag_temperature'] = 0.5
config['notears_group_l1'] = 0.01

trainer = ModelTrainer(config)
# ... train as in Basic Usage ...

After training, inspect the learned adjacency with model.threshold_dag(). Verbose training prints [NOTEARS] warmup and dual-update lines.

For a full guide (config keys, training behaviour, inspection), see DAG and NOTEARS structure learning.

Model Inference

Use your trained model for predictions:

import torch

# Make predictions directly with the model
with torch.no_grad():
    y_pred, media_coeffs, _, outputs = model(
        train_tensors['X_media'],
        train_tensors['X_control'],
        train_tensors['R']
    )

# Access predictions and contributions
predictions = y_pred.cpu().numpy()
media_contributions = outputs['contributions'].cpu().numpy()
control_contributions = outputs['control_contributions'].cpu().numpy()

print(f"Predictions shape: {predictions.shape}")
print(f"Total media impact: {media_contributions.sum():.0f}")

Response Curves Analysis

Analyze non-linear saturation effects for each channel:

from deepcausalmmm.postprocess import ResponseCurveFit
import pandas as pd
import torch

# Get contributions from model
with torch.no_grad():
    y_pred, _, _, outputs = model(
        train_tensors['X_media'],
        train_tensors['X_control'],
        train_tensors['R']
    )

media_contrib = outputs['contributions'].cpu().numpy()

# Prepare channel data for first channel (example)
# ResponseCurveFit expects specific columns: 'week_monday', 'spend', 'impressions', 'predicted'
channel_spend = X_media[:, :, 0].flatten()
channel_contrib = media_contrib[:, :, 0].flatten()

channel_data = pd.DataFrame({
    'week_monday': pd.date_range('2023-01-01', periods=len(channel_spend), freq='W-MON'),
    'spend': channel_spend,
    'impressions': channel_spend,  # Use same as spend if no separate impression data
    'predicted': channel_contrib
})

# Fit response curve
fitter = ResponseCurveFit(
    data=channel_data,
    bottom_param=False,  # Assume zero response at zero spend
    model_level='Overall',  # 'Overall' or 'DMA'
    date_col='week_monday'
)

# Fit the curve and get parameters
fitted_df = fitter.fit()  # Returns DataFrame with fitted parameters

if fitted_df is not None:
    print(f"Slope: {fitter.slope:.3f}")
    print(f"Half-Saturation Point: {fitter.saturation:,.0f}")
    print(f"Top (Max Response): {fitter.top:,.0f}")
else:
    print("Curve fitting failed - check data quality")

Response curves help you:

  • Identify saturation points for each channel

  • Optimize budget allocation across channels

  • Understand diminishing returns

  • Make data-driven investment decisions

Visualization and Analysis

Generate comprehensive visualizations:

from deepcausalmmm.postprocess import ComprehensiveAnalyzer

# Define channel names
media_cols = [f'Channel_{i+1}' for i in range(5)]
control_cols = [f'Control_{i+1}' for i in range(3)]

# Initialize analyzer with required parameters
analyzer = ComprehensiveAnalyzer(
    model=model,
    media_cols=media_cols,
    control_cols=control_cols,
    output_dir='mmm_results',
    pipeline=pipeline,
    config=config
)

# Create full dashboard
# Note: analyzer.analyze_comprehensive() requires specific data format
# For complete dashboard, use examples/dashboard_rmse_optimized.py as template

This creates 14+ interactive visualizations including:

  • Performance metrics and holdout validation

  • Actual vs predicted time series

  • Channel effectiveness and coefficient distributions

  • Economic contributions and waterfall analysis

  • DAG network visualization for causal relationships

  • Individual channel analysis and trends

  • Response curves with saturation analysis

Next Steps

Common Issues

Memory Issues

Reduce batch size or use gradient checkpointing for large datasets.

Convergence Problems

Try adjusting learning rate, regularization, or number of epochs.

Poor Performance

Check data quality, feature engineering, and hyperparameter tuning.

GPU Issues

Ensure CUDA-compatible PyTorch installation and sufficient GPU memory.