deepcausalmmm.postprocess
Post-processing utilities for DeepCausalMMM analysis and visualization.
Functions
create_unified_analyzer – Create a ComprehensiveAnalyzer configured for unified pipeline.
- class deepcausalmmm.postprocess.ComprehensiveAnalyzer(model, media_cols: List[str], control_cols: List[str], output_dir: str = 'mmm_analysis_results', pipeline=None, auto_detect_burnin: bool = True, manual_burnin_weeks: int | None = None, config: Dict | None = None, inference: InferenceManager | None = None)[source]
Modernized comprehensive analyzer for DeepCausalMMM with config-driven visualizations.
- __init__(model, media_cols: List[str], control_cols: List[str], output_dir: str = 'mmm_analysis_results', pipeline=None, auto_detect_burnin: bool = True, manual_burnin_weeks: int | None = None, config: Dict | None = None, inference: InferenceManager | None = None)[source]
Initialize the comprehensive analyzer.
- Parameters:
model – Trained DeepCausalMMM model
media_cols – List of media column names
control_cols – List of control column names
output_dir – Directory to save outputs
pipeline – UnifiedDataPipeline instance for modern data processing
auto_detect_burnin – Whether to automatically detect burn-in weeks from model
manual_burnin_weeks – Manually specify burn-in weeks (overrides auto-detection)
config – Configuration dictionary (uses default if None)
inference – Modern InferenceManager instance
- inverse_transform_target(y_scaled: ndarray) ndarray[source]
Apply inverse transformation to target variable using modern pipeline.
- Parameters:
y_scaled – Scaled target values
- Returns:
Unscaled target values
- inverse_transform_contributions(contributions_scaled: ndarray, y_original: ndarray) ndarray[source]
Apply inverse transformation to contributions using modern pipeline.
- Parameters:
contributions_scaled – Scaled contributions
y_original – Original scale target values
- Returns:
Contributions in original scale
- analyze_with_unified_pipeline(X_media: ndarray, X_control: ndarray, y_true: ndarray, create_plots: bool = True) Dict[str, Any][source]
Perform comprehensive analysis using the unified pipeline.
- Parameters:
X_media – Media data (full dataset)
X_control – Control data (full dataset)
y_true – True target values (full dataset)
create_plots – Whether to create visualization plots
- Returns:
Dictionary with analysis results
- analyze_comprehensive(X_media: ndarray, X_control: ndarray, y_true: ndarray, region_ids: ndarray, weeks: List[int] | None = None) Dict[str, Any][source]
Run comprehensive analysis with all visualizations. Automatically removes burn-in/padding from all outputs.
- Parameters:
X_media – Media variables [n_regions, n_weeks, n_channels] (may include padding)
X_control – Control variables [n_regions, n_weeks, n_controls] (may include padding)
y_true – True target values (scaled, may include padding)
region_ids – Region identifiers
weeks – Week labels (optional)
- Returns:
Dictionary containing all analysis results (burn-in removed)
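The burn-in removal described above can be pictured as slicing the leading weeks off the time axis. The sketch below assumes that the padding sits at the start of the week dimension and uses plain nested lists rather than arrays. `strip_burn_in` is a hypothetical helper for illustration, not part of deepcausalmmm.

```python
from typing import List, Sequence


def strip_burn_in(series_by_region: Sequence[Sequence[float]],
                  burn_in_weeks: int) -> List[List[float]]:
    """Drop the first `burn_in_weeks` entries from every region's series.

    Assumption: burn-in/padding occupies the leading weeks of the time axis.
    """
    if burn_in_weeks < 0:
        raise ValueError("burn_in_weeks must be non-negative")
    return [list(region[burn_in_weeks:]) for region in series_by_region]


# Two regions, six weeks each; the first two weeks are padding.
y_padded = [
    [0.0, 0.0, 10.0, 12.0, 11.0, 13.0],
    [0.0, 0.0, 20.0, 19.0, 21.0, 22.0],
]
y_clean = strip_burn_in(y_padded, burn_in_weeks=2)
```

The same slice would apply to predictions and contributions so that all outputs stay aligned on the week axis.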
- class deepcausalmmm.postprocess.ModelAnalyzer(inference: InferenceManager | None = None, legacy_inference: ModelInference | None = None, scaler: SimpleGlobalScaler | None = None, pipeline=None, config: Dict | None = None, output_dir: str | None = None)[source]
Analyze and visualize DeepCausalMMM model results with modern class-based architecture.
- __init__(inference: InferenceManager | None = None, legacy_inference: ModelInference | None = None, scaler: SimpleGlobalScaler | None = None, pipeline=None, config: Dict | None = None, output_dir: str | None = None)[source]
Initialize the enhanced analyzer with unified pipeline support.
- Parameters:
inference – Modern InferenceManager instance (preferred)
legacy_inference – Legacy ModelInference instance (for compatibility)
scaler – SimpleGlobalScaler for proper inverse transformations (legacy)
pipeline – UnifiedDataPipeline instance (preferred)
config – Model configuration dictionary
output_dir – Directory to save outputs
- analyze_with_unified_pipeline(model, X_media: ndarray, X_control: ndarray, y_true: ndarray, channel_names: List[str], control_names: List[str]) Dict[str, Any][source]
Analyze model results using the unified pipeline.
- Parameters:
model – Trained model
X_media – Media data
X_control – Control data
y_true – True target values
channel_names – Media channel names
control_names – Control variable names
- Returns:
Analysis results dictionary
- analyze_predictions(X_m: ndarray, X_c: ndarray, R: ndarray, y_true: ndarray | None = None, generate_plots: bool = True) Dict[str, Any][source]
Analyze model predictions and generate visualizations.
- Parameters:
X_m – Media variables [n_regions, n_weeks, n_channels]
X_c – Control variables [n_regions, n_weeks, n_controls]
R – Region indices [n_regions] (reserved; InferenceManager currently builds region indices internally)
y_true – Optional ground truth values
generate_plots – Whether to generate and save plots
- Returns:
Dictionary containing analysis results and metrics
- static plot_coefficients_over_time(coefficients: ndarray, channel_names: List[str]) Figure[source]
Plot mean coefficients over time for each channel.
- static plot_contribution_comparison(predictions: ndarray, actuals: ndarray | None, burn_in_weeks: int) Figure[source]
Plot actual vs predicted revenue comparison.
- class deepcausalmmm.postprocess.ResponseCurveFit(data: DataFrame, *, bottom_param: bool = False, model_level: Literal['Overall', 'DMA'] = 'Overall', date_col: str = 'week_monday')[source]
Fit Hill equation response curves to marketing mix model predictions.
The Hill equation models saturation effects: y = bottom + (top - bottom) * x^slope / (saturation^slope + x^slope)
- Parameters:
data (pd.DataFrame) – DataFrame with columns ‘week_monday’, ‘spend’, ‘impressions’, and ‘predicted’. DMA-level fitting also requires a ‘dmacode’ column.
bottom_param (bool, default=False) – Whether to fit a non-zero intercept (the bottom parameter). For MMM this is typically False, so the response at zero spend is 0.
model_level (str, default='Overall') – ‘Overall’ fits a single aggregated curve across all regions; ‘DMA’ fits a separate curve for each DMA.
date_col (str, default='week_monday') – Name of the date column
- figure
Plotly figure object (if generate_figure=True)
- Type:
go.Figure
Examples
>>> # Prepare data
>>> data = pd.DataFrame({
...     'week_monday': dates,
...     'spend': spend_values,
...     'impressions': impression_values,
...     'predicted': model_predictions
... })
>>>
>>> # Fit overall response curve
>>> fitter = ResponseCurveFit(data, model_level='Overall')
>>> fitter.fit(
...     title="Response Curve",
...     x_label="Impressions",
...     y_label="Predicted Visits",
...     generate_figure=True,
...     save_figure=True,
...     output_path='response_curve.html'
... )
>>> print(f"R²: {fitter.r_2:.3f}")
>>> print(f"Slope: {fitter.slope:.3f}")
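To make the Hill equation given above concrete, here is a minimal standalone evaluation of it. `hill_response` is a hypothetical helper written for this illustration and is not claimed to be how ResponseCurveFit computes the curve internally.

```python
def hill_response(x: float, top: float, saturation: float, slope: float,
                  bottom: float = 0.0) -> float:
    """Evaluate y = bottom + (top - bottom) * x^slope / (saturation^slope + x^slope)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    x_pow = x ** slope
    return bottom + (top - bottom) * x_pow / (saturation ** slope + x_pow)


# At x == saturation the curve sits halfway between bottom and top.
half_max = hill_response(50_000, top=1_000_000, saturation=50_000, slope=1.5)
# At zero spend (with bottom = 0) the response is zero.
at_zero = hill_response(0, top=1_000_000, saturation=50_000, slope=1.5)
```

This shows why `saturation` is read as the half-maximum point and why `bottom_param=False` corresponds to a zero response at zero spend.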
- __init__(data: DataFrame, *, bottom_param: bool = False, model_level: Literal['Overall', 'DMA'] = 'Overall', date_col: str = 'week_monday') None[source]
Initialize ResponseCurveFit.
- regression(x_fit, y_fit, x_label, y_label, title, sigfigs, log_x, print_r_sqr, generate_figure, view_figure, *params) None[source]
Backward compatibility wrapper for _calculate_r2_and_plot.
- fit(*, x_label: str = 'x', y_label: str = 'y', title: str = 'Fitted Hill equation', sigfigs: int = 6, log_x: bool = False, print_r_sqr: bool = True, generate_figure: bool = True, view_figure: bool = False, save_figure: bool = False, output_path: str | None = None, curve_fit_kws: dict | None = None) DataFrame | None[source]
Fit Hill equation to the data.
- Parameters:
x_label (str, default='x') – X-axis label
y_label (str, default='y') – Y-axis label
title (str, default='Fitted Hill equation') – Plot title
sigfigs (int, default=6) – Significant figures for equation display
log_x (bool, default=False) – Whether to use log scale for x-axis
print_r_sqr (bool, default=True) – Whether to print R² score
generate_figure (bool, default=True) – Whether to generate visualization
view_figure (bool, default=False) – Whether to display the figure
save_figure (bool, default=False) – Whether to save the figure
output_path (str, optional) – Path to save the figure (if save_figure=True)
curve_fit_kws (dict, optional) – Additional keyword arguments for scipy.optimize.curve_fit
- Returns:
For DMA-level: DataFrame with parameters for each DMA. For Overall: None (parameters are stored as attributes).
- Return type:
pd.DataFrame or None
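As a rough picture of what fitting a Hill curve involves, the sketch below uses a deliberately simple stand-in technique: a grid search over `saturation` and `slope` with a closed-form least-squares `top`, and `bottom` fixed at 0. This is not the library's method (the parameter list above refers to scipy.optimize.curve_fit); `hill_shape` and `fit_hill_grid` are hypothetical names, and the data is synthetic.

```python
from typing import Sequence, Tuple


def hill_shape(x: float, saturation: float, slope: float) -> float:
    """Unit-height Hill curve: x^slope / (saturation^slope + x^slope)."""
    x_pow = x ** slope
    return x_pow / (saturation ** slope + x_pow)


def fit_hill_grid(x: Sequence[float], y: Sequence[float],
                  saturations: Sequence[float],
                  slopes: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (top, saturation, slope, r_squared) minimising squared error.

    For fixed saturation and slope the model y = top * shape(x) is linear in
    `top`, so the least-squares `top` has a closed form.
    """
    y_mean = sum(y) / len(y)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    best = None
    for sat in saturations:
        for slope in slopes:
            h = [hill_shape(xi, sat, slope) for xi in x]
            denom = sum(hi * hi for hi in h)
            if denom == 0:
                continue
            top = sum(hi * yi for hi, yi in zip(h, y)) / denom
            ss_res = sum((yi - top * hi) ** 2 for hi, yi in zip(h, y))
            if best is None or ss_res < best[0]:
                best = (ss_res, top, sat, slope)
    if best is None:
        raise ValueError("no valid grid point")
    ss_res, top, sat, slope = best
    return top, sat, slope, 1.0 - ss_res / ss_tot


# Synthetic, noise-free observations generated from known parameters.
true_top, true_sat, true_slope = 1000.0, 50_000.0, 1.5
x_obs = [10_000.0 * i for i in range(21)]
y_obs = [true_top * hill_shape(xi, true_sat, true_slope) for xi in x_obs]
top, sat, slope, r2 = fit_hill_grid(
    x_obs, y_obs,
    saturations=[25_000.0, 50_000.0, 75_000.0],
    slopes=[1.0, 1.5, 2.0],
)
```

A continuous optimizer replaces the grid in practice; the R² computed here is the usual 1 - SS_res / SS_tot.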
- deepcausalmmm.postprocess.ResponseCurveFitter
alias of
ResponseCurveFit
- deepcausalmmm.postprocess.create_unified_analyzer(model, pipeline, media_cols: list, control_cols: list, output_dir: str = 'unified_analysis_results') ComprehensiveAnalyzer[source]
Create a ComprehensiveAnalyzer configured for unified pipeline.
- Parameters:
model – Trained DeepCausalMMM model
pipeline – UnifiedDataPipeline instance
media_cols – Media column names
control_cols – Control column names
output_dir – Output directory
- Returns:
Configured ComprehensiveAnalyzer
- class deepcausalmmm.postprocess.BudgetOptimizer(budget: float, channels: List[str], response_curves: Dict[str, Dict], *, num_weeks: int = 52, method: str = 'trust-constr')[source]
Optimize marketing budget allocation using response curves.
Uses constrained optimization (trust-constr, SLSQP, or differential evolution) with Hill transformation curves from ResponseCurveFit to find optimal spend allocation that maximizes total response subject to business constraints.
- Parameters:
budget (float) – Total budget to allocate across all channels
channels (List[str]) – List of channel names to include in optimization
response_curves (Dict[str, Dict]) – Response curve parameters by channel from ResponseCurveFit. Each channel dict should contain: ‘top’, ‘bottom’, ‘saturation’, ‘slope’
num_weeks (int, default=52) – Number of weeks for planning horizon (annual by default)
method (str, default='trust-constr') – Optimization method: ‘trust-constr’, ‘SLSQP’, ‘differential_evolution’, ‘hybrid’
- constraints_df
DataFrame with channel-level constraints (lower, upper bounds)
- Type:
pd.DataFrame or None
Examples
>>> # After fitting response curves with ResponseCurveFit
>>> curves = {
...     'TV': {'top': 1000000, 'bottom': 0, 'saturation': 50000, 'slope': 1.5},
...     'Search': {'top': 800000, 'bottom': 0, 'saturation': 30000, 'slope': 2.0},
...     'Social': {'top': 600000, 'bottom': 0, 'saturation': 20000, 'slope': 1.8}
... }
>>>
>>> optimizer = BudgetOptimizer(
...     budget=1000000,
...     channels=['TV', 'Search', 'Social'],
...     response_curves=curves,
...     num_weeks=52
... )
>>>
>>> # Optional: Set channel-specific constraints
>>> optimizer.set_constraints({
...     'TV': {'lower': 50000, 'upper': 500000},
...     'Search': {'lower': 100000, 'upper': 400000}
... })
>>>
>>> # Run optimization
>>> result = optimizer.optimize()
>>>
>>> # View results
>>> if result.success:
...     print("Optimal Allocation:")
...     for channel, spend in result.allocation.items():
...         print(f" {channel}: ${spend:,.0f}")
...     print(f"\nTotal Response: {result.predicted_response:,.0f}")
...     print(f"\nDetailed Results:\n{result.by_channel}")
Notes
The optimizer maximizes total response using the Hill equation:
\[response = bottom + (top - bottom) \cdot \frac{spend^{slope}}{saturation^{slope} + spend^{slope}}\]
Where:
- top: Maximum response (saturation level)
- bottom: Minimum response (typically 0)
- saturation: Spend level at half-maximum response
- slope: Steepness of the response curve
The optimization problem is:
\[ \begin{aligned} \max_{x_1, \ldots, x_n} \quad & \sum_{i=1}^{n} response_i(x_i) \\ \text{s.t.} \quad & \sum_{i=1}^{n} x_i = budget \\ & lower_i \leq x_i \leq upper_i \quad \forall i \end{aligned} \]
- __init__(budget: float, channels: List[str], response_curves: Dict[str, Dict], *, num_weeks: int = 52, method: str = 'trust-constr')[source]
Initialize BudgetOptimizer with budget, channels, and response curves.
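The optimization problem stated in the Notes above can be illustrated with a brute-force stand-in: enumerate every allocation on a coarse grid that spends the full budget and keep the one with the highest total Hill response. This is a teaching sketch, not the trust-constr, SLSQP, or differential evolution solvers the class uses. It ignores per-channel bounds and `num_weeks`, and `grid_allocate` is a hypothetical name.

```python
from itertools import product
from typing import Dict, Tuple


def hill_response(spend: float, top: float, bottom: float,
                  saturation: float, slope: float) -> float:
    """Hill response for one channel at a given spend."""
    s_pow = spend ** slope
    return bottom + (top - bottom) * s_pow / (saturation ** slope + s_pow)


def grid_allocate(budget: float, curves: Dict[str, Dict[str, float]],
                  step: float) -> Tuple[Dict[str, float], float]:
    """Exhaustively search grid allocations that spend the whole budget."""
    channels = list(curves)
    n_steps = int(round(budget / step))
    best_alloc, best_total = {}, float("-inf")
    # Enumerate grid units for all but the last channel; the last channel
    # receives whatever remains so the budget constraint holds exactly.
    for units in product(range(n_steps + 1), repeat=len(channels) - 1):
        remainder = n_steps - sum(units)
        if remainder < 0:
            continue
        spends = [u * step for u in units] + [remainder * step]
        total = sum(hill_response(s, **curves[c])
                    for c, s in zip(channels, spends))
        if total > best_total:
            best_alloc, best_total = dict(zip(channels, spends)), total
    return best_alloc, best_total


curves = {
    'TV': {'top': 1000000, 'bottom': 0, 'saturation': 50000, 'slope': 1.5},
    'Search': {'top': 800000, 'bottom': 0, 'saturation': 30000, 'slope': 2.0},
    'Social': {'top': 600000, 'bottom': 0, 'saturation': 20000, 'slope': 1.8},
}
budget = 1_000_000
allocation, total_response = grid_allocate(budget, curves, step=50_000)
```

The grid search scales exponentially with the number of channels, which is one reason a continuous constrained optimizer is the practical choice.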
- set_constraints(constraints: Dict[str, Dict[str, float]]) None[source]
Set spend constraints for channels.
- Parameters:
constraints (Dict[str, Dict[str, float]]) – Channel constraints: {‘channel’: {‘lower’: min_spend, ‘upper’: max_spend}}
Examples
>>> optimizer.set_constraints({
...     'TV': {'lower': 50000, 'upper': 500000},
...     'Search': {'lower': 100000, 'upper': 400000},
...     'Social': {'lower': 25000, 'upper': 300000}
... })
Notes
Channels not specified in constraints get default bounds: [0, budget]
Upper bounds are automatically capped at total budget
If lower > upper, lower is reset to 0
Upper bounds cannot be 0 (would make channel unusable)
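The bound-handling rules in the Notes above can be sketched as a small normalization step. The order in which the rules are applied and the choice to raise an error for a zero upper bound are assumptions made for this illustration; `normalize_bounds` is a hypothetical helper and not the library's code.

```python
from typing import Dict, Iterable, Tuple


def normalize_bounds(budget: float, channels: Iterable[str],
                     constraints: Dict[str, Dict[str, float]]
                     ) -> Dict[str, Tuple[float, float]]:
    """Apply the documented defaults and corrections to per-channel bounds."""
    bounds: Dict[str, Tuple[float, float]] = {}
    for channel in channels:
        spec = constraints.get(channel, {})
        lower = spec.get("lower", 0.0)                  # default lower bound: 0
        upper = min(spec.get("upper", budget), budget)  # cap at total budget
        if upper == 0:
            # Assumption: this sketch rejects a zero upper bound outright.
            raise ValueError(f"upper bound for {channel!r} cannot be 0")
        if lower > upper:
            lower = 0.0  # inconsistent pair: lower is reset to 0
        bounds[channel] = (lower, upper)
    return bounds


bounds = normalize_bounds(
    1_000_000,
    ["TV", "Search", "Social"],
    {
        "TV": {"lower": 50_000, "upper": 2_000_000},      # upper exceeds budget
        "Search": {"lower": 500_000, "upper": 400_000},   # lower > upper
    },
)
```

Here "Social" has no entry, so it falls back to the default range of 0 to the full budget.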
- optimize() OptimizationResult[source]
Run optimization to find optimal budget allocation.
- Returns:
Optimization results including allocation, predicted response, and details
- Return type:
OptimizationResult
Examples
>>> result = optimizer.optimize()
>>> if result.success:
...     print("Optimization successful!")
...     print(f"Predicted response: {result.predicted_response:,.0f}")
...     for channel, spend in result.allocation.items():
...         roi = result.by_channel[result.by_channel['channel']==channel]['roi'].iloc[0]
...         print(f"{channel}: ${spend:,.0f} (ROI: {roi:.2f})")
... else:
...     print(f"Optimization failed: {result.message}")
- compare_scenarios(scenarios: Dict[str, Dict[str, float]]) DataFrame[source]
Compare different budget allocation scenarios.
- Parameters:
scenarios (Dict[str, Dict[str, float]]) – Dictionary of scenarios: {‘scenario_name’: {‘channel’: spend, …}}
- Returns:
Comparison of scenarios with predicted responses and ROIs
- Return type:
pd.DataFrame
Examples
>>> scenarios = {
...     'Current': {'TV': 400000, 'Search': 350000, 'Social': 250000},
...     'Optimized': result.allocation,
...     'Heavy TV': {'TV': 600000, 'Search': 250000, 'Social': 150000}
... }
>>> comparison = optimizer.compare_scenarios(scenarios)
>>> print(comparison)
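For intuition about what a scenario comparison contains, the sketch below scores each scenario by summing Hill responses over its channels. It assumes ROI means total response divided by total spend, which the reference above does not spell out. `score_scenarios` is a hypothetical name, and the real method returns a DataFrame rather than a list of dicts.

```python
from typing import Dict, List


def hill_response(spend: float, top: float, bottom: float,
                  saturation: float, slope: float) -> float:
    """Hill response for one channel at a given spend."""
    s_pow = spend ** slope
    return bottom + (top - bottom) * s_pow / (saturation ** slope + s_pow)


def score_scenarios(scenarios: Dict[str, Dict[str, float]],
                    curves: Dict[str, Dict[str, float]]) -> List[dict]:
    """Return one summary row per scenario."""
    rows = []
    for name, allocation in scenarios.items():
        spend = sum(allocation.values())
        response = sum(hill_response(s, **curves[ch])
                       for ch, s in allocation.items())
        # Assumption: ROI is defined here as response per unit of spend.
        roi = response / spend if spend > 0 else float("nan")
        rows.append({"scenario": name, "total_spend": spend,
                     "total_response": response, "roi": roi})
    return rows


curves = {
    'TV': {'top': 1000000, 'bottom': 0, 'saturation': 50000, 'slope': 1.5},
    'Search': {'top': 800000, 'bottom': 0, 'saturation': 30000, 'slope': 2.0},
    'Social': {'top': 600000, 'bottom': 0, 'saturation': 20000, 'slope': 1.8},
}
scenarios = {
    'Current': {'TV': 400000, 'Search': 350000, 'Social': 250000},
    'Heavy TV': {'TV': 600000, 'Search': 250000, 'Social': 150000},
}
rows = score_scenarios(scenarios, curves)
```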
- class deepcausalmmm.postprocess.OptimizationResult(success: bool, allocation: Dict[str, float], predicted_response: float, by_channel: DataFrame, message: str = '', method: str = 'trust-constr')[source]
Result from budget optimization.
- by_channel
Detailed results by channel with spend, response, and ROI
- Type:
pd.DataFrame
Examples
>>> result = optimizer.optimize()
>>> if result.success:
...     print(f"Optimal allocation: {result.allocation}")
...     print(f"Expected response: {result.predicted_response:,.0f}")
...     print(result.by_channel)
- deepcausalmmm.postprocess.optimize_budget_from_curves(budget: float, curve_params: DataFrame, *, channel_col: str = 'channel', num_weeks: int = 52, constraints: Dict[str, Dict[str, float]] | None = None, method: str = 'trust-constr') OptimizationResult[source]
Convenience function to optimize budget directly from curve parameters DataFrame.
This function is useful when you have response curve parameters in a DataFrame (e.g., from ResponseCurveFit fitted on multiple channels) and want to quickly run optimization without manually setting up the BudgetOptimizer.
- Parameters:
budget (float) – Total budget to allocate
curve_params (pd.DataFrame) – DataFrame with response curve parameters. Required columns: channel, top, bottom, saturation, slope
channel_col (str, default='channel') – Name of the channel column in curve_params
num_weeks (int, default=52) – Number of weeks for planning horizon
constraints (Dict[str, Dict[str, float]], optional) – Channel-specific constraints: {‘channel’: {‘lower’: min, ‘upper’: max}}
method (str, default='trust-constr') – Optimization method
- Returns:
Optimization results
- Return type:
OptimizationResult
Examples
>>> # After fitting curves for multiple channels
>>> curves_df = pd.DataFrame({
...     'channel': ['TV', 'Search', 'Social'],
...     'top': [1000000, 800000, 600000],
...     'bottom': [0, 0, 0],
...     'saturation': [50000, 30000, 20000],
...     'slope': [1.5, 2.0, 1.8]
... })
>>>
>>> result = optimize_budget_from_curves(
...     budget=1000000,
...     curve_params=curves_df,
...     constraints={'TV': {'lower': 100000, 'upper': 600000}}
... )
>>> print(result.allocation)
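The relationship between the tabular `curve_params` input and the per-channel dictionary that BudgetOptimizer expects can be sketched as a simple reshaping step. The example works on row records (such as those produced by pandas' `DataFrame.to_dict('records')`) to stay dependency-free. `records_to_curves` is a hypothetical helper, not the function's actual internals.

```python
from typing import Dict, Iterable, Mapping

REQUIRED = ("top", "bottom", "saturation", "slope")


def records_to_curves(records: Iterable[Mapping[str, object]],
                      channel_col: str = "channel") -> Dict[str, Dict[str, float]]:
    """Convert row records into {channel: {'top', 'bottom', 'saturation', 'slope'}}."""
    curves: Dict[str, Dict[str, float]] = {}
    for row in records:
        missing = [name for name in REQUIRED if name not in row]
        if missing:
            raise KeyError(f"row for {row.get(channel_col)!r} lacks {missing}")
        curves[str(row[channel_col])] = {name: float(row[name]) for name in REQUIRED}
    return curves


rows = [
    {"channel": "TV", "top": 1000000, "bottom": 0, "saturation": 50000, "slope": 1.5},
    {"channel": "Search", "top": 800000, "bottom": 0, "saturation": 30000, "slope": 2.0},
]
curves = records_to_curves(rows)
```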
- deepcausalmmm.postprocess.prepare_optimization_data(contributions_df: DataFrame, media_data: DataFrame, *, date_col: str = 'week_monday', channel_col: str = 'channel', contribution_col: str = 'predicted', spend_col: str = 'spend', impressions_col: str = 'impressions') DataFrame[source]
Prepare data from DeepCausalMMM outputs for response curve fitting and optimization.
This function merges model contribution predictions with media spend/impression data to create the required format for ResponseCurveFit.
- Parameters:
contributions_df (pd.DataFrame) – Model contributions output with columns: date, channel, predicted
media_data (pd.DataFrame) – Media data with columns: date, channel, spend, impressions
date_col (str, default='week_monday') – Name of the date column
channel_col (str, default='channel') – Name of the channel column
contribution_col (str, default='predicted') – Name of the contribution/prediction column
spend_col (str, default='spend') – Name of the spend column
impressions_col (str, default='impressions') – Name of the impressions column
- Returns:
Merged data ready for ResponseCurveFit with columns: week_monday, channel, spend, impressions, predicted
- Return type:
pd.DataFrame
Examples
>>> # After training DeepCausalMMM model
>>> contributions = model.get_contributions()  # Your model output
>>> media_df = pd.read_csv('media_data.csv')
>>>
>>> optimization_data = prepare_optimization_data(
...     contributions_df=contributions,
...     media_data=media_df
... )
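The merge this function describes can be pictured as a join on the date and channel columns. The sketch below does an inner join over plain row dictionaries. Whether the real function uses inner-join semantics is an assumption, and `merge_contributions_with_media` is a hypothetical name with made-up sample values.

```python
from typing import Dict, List


def merge_contributions_with_media(contributions: List[Dict],
                                   media: List[Dict],
                                   date_col: str = "week_monday",
                                   channel_col: str = "channel") -> List[Dict]:
    """Inner-join contribution rows with media rows on (date, channel)."""
    media_by_key = {(row[date_col], row[channel_col]): row for row in media}
    merged = []
    for row in contributions:
        match = media_by_key.get((row[date_col], row[channel_col]))
        if match is None:
            continue  # inner join: skip contributions without media data
        merged.append({**match, **row})
    return merged


# Illustrative rows only; the numbers are not real model output.
contributions = [
    {"week_monday": "2024-01-01", "channel": "TV", "predicted": 1200.0},
    {"week_monday": "2024-01-01", "channel": "Search", "predicted": 900.0},
    {"week_monday": "2024-01-08", "channel": "TV", "predicted": 1100.0},
]
media = [
    {"week_monday": "2024-01-01", "channel": "TV", "spend": 5000.0, "impressions": 80000},
    {"week_monday": "2024-01-01", "channel": "Search", "spend": 3000.0, "impressions": 40000},
]
merged = merge_contributions_with_media(contributions, media)
```

Each merged row carries the five columns listed under Returns: week_monday, channel, spend, impressions, predicted.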
- deepcausalmmm.postprocess.fit_response_curves_batch(data: DataFrame, channels: List[str] | None = None, *, bottom_param: bool = False, model_level: str = 'Overall', date_col: str = 'week_monday', generate_figures: bool = False, save_figures: bool = False, output_dir: str | None = None) Tuple[Dict[str, Dict], DataFrame][source]
Fit response curves for multiple channels in batch.
This is a convenience wrapper around ResponseCurveFit that processes multiple channels and returns both dictionary and DataFrame formats.
- Parameters:
data (pd.DataFrame) – Data prepared by prepare_optimization_data() with columns: week_monday, channel, spend, impressions, predicted
channels (List[str], optional) – List of channels to fit. If None, fits all channels in data
bottom_param (bool, default=False) – Whether to fit non-zero intercept
model_level (str, default='Overall') – Aggregation level: ‘Overall’ or ‘DMA’
date_col (str, default='week_monday') – Name of date column
generate_figures (bool, default=False) – Whether to generate plots
save_figures (bool, default=False) – Whether to save plots to files
output_dir (str, optional) – Directory to save plots (required if save_figures=True)
- Returns:
curves_dict (Dict[str, Dict]) – Response curve parameters by channel
curves_df (pd.DataFrame) – Response curve parameters as DataFrame
Examples
>>> # After preparing data
>>> curves_dict, curves_df = fit_response_curves_batch(
...     data=optimization_data,
...     channels=['TV', 'Search', 'Social'],
...     generate_figures=True,
...     save_figures=True,
...     output_dir='./response_curves/'
... )
>>> print(curves_df)
- deepcausalmmm.postprocess.create_optimizer_from_model_output(contributions_df: DataFrame, media_data: DataFrame, budget: float, *, channels: List[str] | None = None, num_weeks: int = 52, constraints: Dict[str, Dict[str, float]] | None = None, method: str = 'trust-constr', generate_figures: bool = False, save_figures: bool = False, output_dir: str | None = None) Tuple[BudgetOptimizer, DataFrame][source]
End-to-end: Create optimizer from DeepCausalMMM model outputs.
This function handles the complete workflow:
1. Prepare data from model outputs
2. Fit response curves for all channels
3. Create and configure BudgetOptimizer
- Parameters:
contributions_df (pd.DataFrame) – Model contribution predictions
media_data (pd.DataFrame) – Media spend and impression data
budget (float) – Total budget to optimize
channels (List[str], optional) – Channels to include. If None, uses all channels
num_weeks (int, default=52) – Planning horizon in weeks
constraints (Dict[str, Dict[str, float]], optional) – Channel spend constraints
method (str, default='trust-constr') – Optimization method
generate_figures (bool, default=False) – Whether to generate response curve plots
save_figures (bool, default=False) – Whether to save plots
output_dir (str, optional) – Directory for plots
- Returns:
optimizer (BudgetOptimizer) – Configured optimizer ready to run
curves_df (pd.DataFrame) – Response curve parameters
Examples
>>> # Complete workflow from model outputs to optimizer
>>> optimizer, curves = create_optimizer_from_model_output(
...     contributions_df=model_contributions,
...     media_data=media_df,
...     budget=1000000,
...     constraints={'TV': {'lower': 100000, 'upper': 600000}},
...     generate_figures=True,
...     save_figures=True,
...     output_dir='./optimization_results/'
... )
>>>
>>> # Run optimization
>>> result = optimizer.optimize()
>>> print(result.allocation)
- deepcausalmmm.postprocess.compare_current_vs_optimal(current_allocation: Dict[str, float], optimal_result: OptimizationResult, *, metric_name: str = 'Response') DataFrame[source]
Compare current budget allocation vs optimized allocation.
- Parameters:
current_allocation (Dict[str, float]) – Current spend by channel
optimal_result (OptimizationResult) – Result from optimizer.optimize()
metric_name (str, default='Response') – Name of the metric being optimized
- Returns:
Comparison table with current, optimal, and deltas
- Return type:
pd.DataFrame
Examples
>>> current = {'TV': 400000, 'Search': 350000, 'Social': 250000}
>>> result = optimizer.optimize()
>>>
>>> comparison = compare_current_vs_optimal(current, result)
>>> print(comparison)
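The deltas in such a comparison table amount to simple per-channel arithmetic on spend. The sketch below computes absolute and percentage changes from two allocation dictionaries. The column names, the `allocation_deltas` helper, and the "optimal" figures are all hypothetical and only illustrate the calculation.

```python
from typing import Dict, List


def allocation_deltas(current: Dict[str, float],
                      optimal: Dict[str, float]) -> List[dict]:
    """Per-channel spend change between two allocations."""
    rows = []
    for channel in sorted(set(current) | set(optimal)):
        cur = current.get(channel, 0.0)
        opt = optimal.get(channel, 0.0)
        delta = opt - cur
        # Percentage change is undefined when the current spend is zero.
        pct = 100.0 * delta / cur if cur else float("nan")
        rows.append({"channel": channel, "current": cur, "optimal": opt,
                     "delta": delta, "pct_change": pct})
    return rows


current = {'TV': 400000, 'Search': 350000, 'Social': 250000}
# Illustrative allocation only; not the output of a real optimization run.
optimal = {'TV': 300000, 'Search': 450000, 'Social': 250000}
rows = allocation_deltas(current, optimal)
```

When both allocations spend the same total budget, the deltas sum to zero, which is a quick sanity check on any comparison table.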
- deepcausalmmm.postprocess.generate_optimization_report(result: OptimizationResult, curves_df: DataFrame, current_allocation: Dict[str, float] | None = None, *, output_path: str | None = None) str[source]
Generate a comprehensive text report of optimization results.
- Parameters:
result (OptimizationResult) – Optimization result
curves_df (pd.DataFrame) – Response curve parameters
current_allocation (Dict[str, float], optional) – Current allocation for comparison
output_path (str, optional) – Path to save report (if not provided, returns as string)
- Returns:
Formatted report text
- Return type:
str
Examples
>>> report = generate_optimization_report(
...     result=result,
...     curves_df=curves,
...     current_allocation={'TV': 400000, 'Search': 350000, 'Social': 250000},
...     output_path='optimization_report.txt'
... )
>>> print(report)
Modules
Post-processing utilities for analyzing and visualizing DeepCausalMMM results.
Comprehensive post-processing analysis for DeepCausalMMM with inverse transformation.
Post-processing utilities for DAG visualization and analysis.
Budget Optimization for Marketing Mix Modeling.
Utility functions for budget optimization with DeepCausalMMM.
Response curve fitting for Marketing Mix Modeling using Hill equation.