Response Curves
The response curves module provides non-linear saturation analysis using Hill equations to model diminishing returns in marketing channels.
Overview
Response curves help you understand:
Saturation Points: When additional spend/impressions yield diminishing returns
Optimal Allocation: Which channels have room for increased investment
S-Shaped Relationships: Non-linear effects of marketing activities
Channel Efficiency: Compare saturation across different channels
Key Features
Hill Equation Fitting: Fits S-shaped saturation curves to channel data
Automatic Aggregation: Aggregates DMA-week data to national weekly level
Direct Attribution: Works with additive contributions from linear scaling (v1.0.19+)
Interactive Visualizations: Plotly-based plots with hover details
Performance Metrics: R², slope, and saturation point for each channel
Backward Compatibility: Maintains support for legacy method names
ResponseCurveFit Class
- class deepcausalmmm.postprocess.response_curves.ResponseCurveFit(data: DataFrame, *, bottom_param: bool = False, model_level: Literal['Overall', 'DMA'] = 'Overall', date_col: str = 'week_monday')
Bases: object
Fit Hill equation response curves to marketing mix model predictions.
The Hill equation models saturation effects: y = bottom + (top - bottom) * x^slope / (saturation^slope + x^slope)
- Parameters:
data (pd.DataFrame) – DataFrame with columns 'week_monday', 'spend', 'impressions', and 'predicted'. DMA-level fits also need a 'dmacode' column.
bottom_param (bool, default=False) – Whether to fit a non-zero intercept (the bottom parameter). For MMM this is typically False, since the response at zero spend is 0.
model_level (str, default='Overall') – 'Overall': a single aggregated curve across all regions. 'DMA': a separate curve for each DMA.
date_col (str, default='week_monday') – Name of the date column.
- figure
Plotly figure object (if generate_figure=True)
- Type:
go.Figure
Examples
>>> # Prepare data
>>> data = pd.DataFrame({
...     'week_monday': dates,
...     'spend': spend_values,
...     'impressions': impression_values,
...     'predicted': model_predictions
... })
>>>
>>> # Fit overall response curve
>>> fitter = ResponseCurveFit(data, Modellevel='Overall')
>>> fitter.fit_model(
...     title="Response Curve",
...     x_label="Impressions",
...     y_label="Predicted Visits",
...     generate_figure=True,
...     save_figure=True,
...     output_path='response_curve.html'
... )
>>> print(f"R²: {fitter.r_2:.3f}")
>>> print(f"Slope: {fitter.slope:.3f}")
- __init__(data: DataFrame, *, bottom_param: bool = False, model_level: Literal['Overall', 'DMA'] = 'Overall', date_col: str = 'week_monday') -> None
Initialize ResponseCurveFit.
- regression(x_fit, y_fit, x_label, y_label, title, sigfigs, log_x, print_r_sqr, generate_figure, view_figure, *params) -> None
Backward compatibility wrapper for _calculate_r2_and_plot.
- fit(*, x_label: str = 'x', y_label: str = 'y', title: str = 'Fitted Hill equation', sigfigs: int = 6, log_x: bool = False, print_r_sqr: bool = True, generate_figure: bool = True, view_figure: bool = False, save_figure: bool = False, output_path: str | None = None, curve_fit_kws: dict | None = None) -> DataFrame | None
Fit Hill equation to the data.
- Parameters:
x_label (str, default='x') – X-axis label
y_label (str, default='y') – Y-axis label
title (str, default='Fitted Hill equation') – Plot title
sigfigs (int, default=6) – Significant figures for equation display
log_x (bool, default=False) – Whether to use log scale for x-axis
print_r_sqr (bool, default=True) – Whether to print R² score
generate_figure (bool, default=True) – Whether to generate visualization
view_figure (bool, default=False) – Whether to display the figure
save_figure (bool, default=False) – Whether to save the figure
output_path (str, optional) – Path to save the figure (if save_figure=True)
curve_fit_kws (dict, optional) – Additional keyword arguments for scipy.optimize.curve_fit
- Returns:
For DMA-level: DataFrame with parameters for each DMA.
For Overall: None (parameters are stored as attributes).
- Return type:
pd.DataFrame or None
Basic Usage
Fitting Response Curves
from deepcausalmmm.postprocess import ResponseCurveFit
import pandas as pd
# Prepare your data
channel_data = pd.DataFrame({
    'week': [1, 2, 3, ...],
    'impressions': [10000, 15000, 20000, ...],
    'contributions': [500000, 650000, 750000, ...]
})
# Initialize fitter
fitter = ResponseCurveFit(
    data=channel_data,
    x_col='impressions',
    y_col='contributions',
    model_level='national',
    date_col='week'
)
# Fit the curve
slope, saturation = fitter.fit_curve()
print(f"Slope (a): {slope:.3f}")
print(f"Half-Saturation Point (g): {saturation:.0f}")
Calculating R² and Plotting
# Calculate R² and generate interactive plot
r2_score = fitter.calculate_r2_and_plot(
    save_path='response_curve_channel.html'
)
print(f"R² Score: {r2_score:.3f}")
Complete Workflow
from deepcausalmmm.postprocess import ResponseCurveFit
import pandas as pd
# Load channel data (impressions and contributions)
df = pd.read_csv('channel_data.csv')
# Initialize and fit
fitter = ResponseCurveFit(
    data=df,
    x_col='impressions',
    y_col='contributions',
    model_level='national',
    date_col='week'
)
# Get fitted parameters
slope, saturation = fitter.fit_curve()
# Generate plot and get R²
r2 = fitter.calculate_r2_and_plot(save_path='curve.html')
# Interpret results
print(f"Channel Saturation Analysis:")
print(f" Slope (a): {slope:.3f}")
print(f" Half-Saturation (g): {saturation:,.0f} impressions")
print(f" Fit Quality (R²): {r2:.3f}")
if slope >= 2.0:
    print(" Strong S-shaped curve (diminishing returns)")
else:
    print(" Gentle curve (less pronounced saturation)")
if r2 >= 0.8:
    print(" Excellent fit")
elif r2 >= 0.6:
    print(" Good fit")
else:
    print(" Moderate fit - review data quality")
Advanced Usage
Batch Processing Multiple Channels
import pandas as pd
from deepcausalmmm.postprocess import ResponseCurveFit
# Assume you have a DataFrame with multiple channels
all_channels_data = pd.read_csv('all_channels.csv')
results = []
for channel in all_channels_data['channel'].unique():
    # Filter data for this channel
    channel_df = all_channels_data[
        all_channels_data['channel'] == channel
    ].copy()
    # Fit response curve
    fitter = ResponseCurveFit(
        data=channel_df,
        x_col='impressions',
        y_col='contributions',
        model_level='national',
        date_col='week'
    )
    slope, saturation = fitter.fit_curve()
    r2 = fitter.calculate_r2_and_plot(
        save_path=f'curves/{channel}_response_curve.html'
    )
    results.append({
        'channel': channel,
        'slope': slope,
        'saturation': saturation,
        'r2': r2
    })
# Create summary DataFrame
summary = pd.DataFrame(results)
summary = summary.sort_values('r2', ascending=False)
print(summary)
Interpreting Results
Slope (a) Parameter:
a >= 3.0: Very strong S-curve, rapid saturation
2.0 <= a < 3.0: Strong S-curve, clear diminishing returns
1.0 <= a < 2.0: Gentle curve, gradual saturation
a < 1.0: Very gentle, almost linear
Half-Saturation Point (g):
The impression/spend level where the channel reaches 50% of maximum effect
Lower values indicate faster saturation
Compare across channels to identify efficiency
R² Score:
R² >= 0.8: Excellent fit, high confidence
0.6 <= R² < 0.8: Good fit, reasonable confidence
0.4 <= R² < 0.6: Moderate fit, review data
R² < 0.4: Poor fit, investigate data quality or model assumptions
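The slope and R² bands above can be wrapped in a small helper for labelling batch results. This is only a sketch: `describe_slope` and `describe_r2` are hypothetical names used for illustration and are not part of deepcausalmmm.

```python
def describe_slope(a: float) -> str:
    """Map a fitted slope (a) to the qualitative bands listed above."""
    if a >= 3.0:
        return "Very strong S-curve, rapid saturation"
    if a >= 2.0:
        return "Strong S-curve, clear diminishing returns"
    if a >= 1.0:
        return "Gentle curve, gradual saturation"
    return "Very gentle, almost linear"


def describe_r2(r2: float) -> str:
    """Map an R² score to the fit-quality bands listed above."""
    if r2 >= 0.8:
        return "Excellent fit, high confidence"
    if r2 >= 0.6:
        return "Good fit, reasonable confidence"
    if r2 >= 0.4:
        return "Moderate fit, review data"
    return "Poor fit, investigate data quality or model assumptions"


# Hypothetical fitted values for one channel.
print(describe_slope(2.4))
print(describe_r2(0.71))
```

Applied to a summary table of fitted channels, the same functions could populate two extra label columns.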
Hill Equation
The response curve uses the Hill equation:
y = x^a / (x^a + g^a)
Where:
x: Input variable (impressions or spend)
y: Output variable (contributions or response)
a: Slope parameter (controls steepness of S-curve)
g: Half-saturation point (x value where y = 0.5)
Properties:
Monotonic: Always increasing
Bounded: Output between 0 and 1 (when normalized)
S-Shaped: Pronounced when a >= 2.0 (the curve has an inflection point for any a > 1)
Half-Saturation: y(g) = 0.5
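The properties listed above can be checked numerically. The sketch below assumes the normalized form y = x^a / (x^a + g^a), which matches the stated definitions (output between 0 and 1, y(g) = 0.5). The function name `hill` and the parameter values are illustrative, not taken from the library.

```python
import numpy as np


def hill(x, a, g):
    """Normalized Hill curve: y = x^a / (x^a + g^a)."""
    x = np.asarray(x, dtype=float)
    return x**a / (x**a + g**a)


# Illustrative parameters: slope 2, half-saturation at 50,000 impressions.
a, g = 2.0, 50_000.0
x = np.linspace(1.0, 500_000.0, 1_000)
y = hill(x, a, g)

half = float(hill(g, a, g))                        # half-saturation: y(g)
is_monotonic = bool(np.all(np.diff(y) > 0))        # always increasing
is_bounded = bool(np.all((y > 0.0) & (y < 1.0)))   # stays inside (0, 1)

print(half, is_monotonic, is_bounded)
```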
Technical Details
Fitting Algorithm
The module uses scipy.optimize.curve_fit with:
Initial Guess: a=1, g=median(x)
Bounds: a ∈ [0.01, 100], g ∈ [0.01, max(x) × 10]
Method: Trust Region Reflective (default)
Max Iterations: 10,000
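A rough sketch of how these settings map onto a direct scipy.optimize.curve_fit call is shown below. It runs on synthetic data and is not the module's internal code: the `hill` helper, the noise level, and the use of `max_nfev` as the iteration cap are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def hill(x, a, g):
    """Normalized Hill curve, written as 1 / (1 + (g/x)^a) for x > 0.

    Algebraically equal to x^a / (x^a + g^a), but avoids inf/inf
    when the optimizer tries a large slope.
    """
    return 1.0 / (1.0 + (g / x) ** a)


# Synthetic data with known parameters (a=2, g=60,000) plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(1_000.0, 200_000.0, 60)
y = hill(x, 2.0, 60_000.0) + rng.normal(0.0, 0.01, size=x.size)

# Settings described above: initial guess, bounds, TRF, iteration cap.
p0 = [1.0, float(np.median(x))]
bounds = ([0.01, 0.01], [100.0, float(x.max()) * 10.0])

popt, _ = curve_fit(
    hill, x, y,
    p0=p0,
    bounds=bounds,
    method="trf",      # Trust Region Reflective
    max_nfev=10_000,   # cap on function evaluations
)
a_hat, g_hat = popt
print(f"a = {a_hat:.2f}, g = {g_hat:,.0f}")
```

When bounds are supplied, curve_fit selects the Trust Region Reflective method on its own; passing `method="trf"` only makes the choice explicit.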
Data Preprocessing
Aggregation: Groups by date column and sums x and y
Sorting: Sorts by x values for consistent fitting
Normalization: Internally normalizes y for numerical stability
Scaling: Scales fitted curve back to original y scale
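The four preprocessing steps above might look like the following in pandas. This is a sketch under assumptions: the column names are made up for the example, and dividing by the maximum of y is only one plausible normalization, since the exact scheme is not stated here.

```python
import pandas as pd

# Toy DMA-week data: two DMAs observed over two weeks.
raw = pd.DataFrame({
    "week": ["2024-01-01", "2024-01-01", "2024-01-08", "2024-01-08"],
    "dmacode": [501, 602, 501, 602],
    "impressions": [8_000, 4_000, 3_000, 2_000],
    "contributions": [300.0, 200.0, 150.0, 100.0],
})

# Aggregation: group by the date column and sum x and y.
national = raw.groupby("week", as_index=False)[["impressions", "contributions"]].sum()

# Sorting: order rows by x for consistent fitting.
national = national.sort_values("impressions").reset_index(drop=True)

# Normalization (assumed scheme): divide y by its maximum.
y_scale = national["contributions"].max()
national["y_norm"] = national["contributions"] / y_scale

# Scaling: multiply by the same factor to return to the original y scale.
national["y_restored"] = national["y_norm"] * y_scale

print(national)
```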
Backward Compatibility
Legacy Method Names
For backward compatibility, the following legacy method names are supported:
fitter = ResponseCurveFit(data=df, x_col='x', y_col='y')
# New API (recommended)
result = fitter._hill_equation(x, a, g)
slope, sat = fitter.fit_curve()
r2 = fitter.calculate_r2_and_plot()
# Legacy API (still works)
result = fitter.Hill(x, a, g)
slope, sat = fitter.get_param()
r2 = fitter.regression()
slope, sat = fitter.fit_model() # Alias for fit_curve
Legacy Parameter Names
# New API (recommended)
fitter = ResponseCurveFit(
    data=df,
    x_col='impressions',
    y_col='contributions',
    model_level='national',
    date_col='week'
)
# Legacy API (still works)
fitter = ResponseCurveFit(
    data=df,
    x_col='impressions',
    y_col='contributions',
    Modellevel='national',  # Old name
    Datecol='week'  # Old name
)
ResponseCurveFitter Alias
The original class name ResponseCurveFitter is maintained as an alias:
from deepcausalmmm.postprocess import ResponseCurveFitter
# This works identically to ResponseCurveFit
fitter = ResponseCurveFitter(data=df, x_col='x', y_col='y')
Best Practices
Data Quality
Sufficient Points: Use at least 20-30 data points for reliable fitting
Range Coverage: Ensure data covers a wide range of x values
Outlier Handling: Remove or investigate extreme outliers before fitting
Monotonicity: Response curves assume y generally increases with x
Aggregation Level
National Weekly: Recommended for most analyses (reduces noise)
Regional: Use when analyzing regional differences
DMA-Level: Use with caution (high variance)
Model Validation
R² Threshold: Aim for R² >= 0.6 for reliable insights
Visual Inspection: Always review the generated plots
Business Logic: Ensure fitted parameters make business sense
Cross-Validation: Test on holdout periods when possible
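One way to act on the R² threshold and the holdout suggestion above is to fit on the earlier weeks and score the later ones. The sketch below uses synthetic data and plain scipy rather than the ResponseCurveFit class; the `hill` and `r_squared` helpers and the 40/12 week split are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def hill(x, a, g):
    """Normalized Hill curve, written as 1 / (1 + (g/x)^a) for x > 0."""
    return 1.0 / (1.0 + (g / x) ** a)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic weekly series in time order (illustrative values only).
rng = np.random.default_rng(42)
x = rng.uniform(5_000.0, 150_000.0, size=52)
y = hill(x, 2.0, 60_000.0) + rng.normal(0.0, 0.02, size=x.size)

# Fit on the first 40 weeks, hold out the final 12.
x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]

popt, _ = curve_fit(
    hill, x_train, y_train,
    p0=[1.0, float(np.median(x_train))],
    bounds=([0.01, 0.01], [100.0, float(x_train.max()) * 10.0]),
)

r2_train = r_squared(y_train, hill(x_train, *popt))
r2_holdout = r_squared(y_test, hill(x_test, *popt))
print(f"R² train: {r2_train:.3f}, holdout: {r2_holdout:.3f}")
```

A holdout R² far below the training R² suggests the curve is fitting noise in the training weeks rather than a stable saturation pattern.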
Common Issues
Poor Fit (Low R²):
Check for outliers or data quality issues
Verify monotonic relationship between x and y
Consider if Hill equation is appropriate for your data
Try different aggregation levels
Unrealistic Parameters:
Very high slope (a > 10): May indicate overfitting
Very high saturation (g >> max(x)): Channel not reaching saturation
Very low saturation (g << median(x)): Most data in saturated region
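These checks can be automated after each fit. The helper below is a hypothetical sketch, not a library function: `flag_parameters` and its `far_factor` cutoff (used here to stand in for ">>" and "<<") are assumptions that should be tuned to your data.

```python
import numpy as np


def flag_parameters(a, g, x, far_factor=10.0):
    """Return warnings for fitted Hill parameters that look unrealistic.

    far_factor is an assumed multiplier that makes '>>' and '<<' concrete.
    """
    x = np.asarray(x, dtype=float)
    flags = []
    if a > 10:
        flags.append("very high slope (a > 10): may indicate overfitting")
    if g > far_factor * x.max():
        flags.append("g >> max(x): channel not reaching saturation")
    if g < np.median(x) / far_factor:
        flags.append("g << median(x): most data in saturated region")
    return flags


# Hypothetical weekly impressions and fitted parameters.
impressions = [10_000, 50_000, 100_000]
print(flag_parameters(2.0, 60_000.0, impressions))    # no flags
print(flag_parameters(12.0, 5_000_000.0, impressions))
```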
Convergence Issues:
Increase max iterations
Try different initial guesses
Check for numerical issues (very large/small values)
Normalize your data before fitting
Examples
See the examples/ directory for complete examples:
example_response_curves.py: Full workflow with DeepCausalMMM integration
dashboard_rmse_optimized.py: Dashboard with integrated response curves
See Also
Analysis and Postprocessing: General analysis utilities
Core Model Components: Core model components
Tutorials: Step-by-step tutorials
Examples: Practical examples