21.10 Types of Treatment Effects

When evaluating the causal impact of an intervention, different estimands (quantities of interest) can be used to measure treatment effects, depending on the study design and assumptions about compliance.

Terminology:

  • Estimands: The causal effect parameters we seek to measure.
  • Estimators: The statistical procedures used to estimate those parameters.
  • Sources of Bias (Keele and Grieve 2025):

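$$\text{Estimator} - \text{True Causal Effect} = \underbrace{\text{Hidden bias}}_{\text{Due to design}} + \underbrace{\text{Misspecification bias}}_{\text{Due to modeling}} + \underbrace{\text{Statistical noise}}_{\text{Due to finite sample}}$$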

  1. Hidden Bias (Due to Design)
  • Arises from unobserved confounders and measurement error that remain after conditioning on observed covariates.
  • Is “hidden” because its true magnitude or direction cannot be directly observed.
  • Violations of conditional exchangeability (also called no unobserved confounding) imply the presence of hidden bias.
  2. Misspecification Bias (Due to Modeling)
  • Occurs when the assumed model for the outcome or treatment assignment does not reflect the true data-generating process.
  • Persists even if we have perfect exchangeability (i.e., no hidden bias).
  • Can be viewed as under-specification (omitting essential terms or functional forms) or over-specification (including unnecessary parameters).
  3. Statistical Noise (Due to Finite Sample)
  • Even with perfect design and correct model specification, finite samples lead to randomness in estimates.
  • Standard errors, confidence intervals, and p-values reflect this uncertainty.

In practice, all three sources of bias and uncertainty can coexist to varying degrees.


21.10.1 Average Treatment Effect

The Average Treatment Effect (ATE) is the expected difference between the outcome an individual would have under treatment and the outcome the same individual would have under control, averaged over the population.

Definition

Let:

  • Yi(1) be the outcome of individual i under treatment.

  • Yi(0) be the outcome of individual i under control.

The individual treatment effect is:

$$\tau_i = Y_i(1) - Y_i(0)$$

Since we cannot observe both Yi(1) and Yi(0) for the same individual (a fundamental problem in causal inference), we estimate the ATE across a population:

$$\text{ATE} = E[Y(1)] - E[Y(0)]$$

Identification Under Randomization

If treatment assignment is randomized (under Experimental Design), then the observed difference in means between treatment and control groups provides an unbiased estimator of ATE:

$$\text{ATE} = \frac{1}{N}\sum_{i=1}^{N}\tau_i = \frac{\sum_{i=1}^{N} Y_i(1)}{N} - \frac{\sum_{i=1}^{N} Y_i(0)}{N}$$

With randomization, we assume:

E[Y(1)|D=1]=E[Y(1)|D=0]=E[Y(1)]

E[Y(0)|D=1]=E[Y(0)|D=0]=E[Y(0)]

Thus, the difference in observed means between treated and control groups provides an unbiased estimate of ATE:

$$E[Y \mid D=1] - E[Y \mid D=0] = E[Y(1) \mid D=1] - E[Y(0) \mid D=0] = E[Y(1)] - E[Y(0)] = \text{ATE}$$
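
The identification argument above can be checked with a small simulation. This is only a sketch under assumed values: the sample size, the normal outcome distributions, and the mean effect of 2 are illustrative choices, not taken from the text. Both potential outcomes are generated (something never possible with real data), treatment is randomized, and the difference in observed means is compared with the true ATE.

```python
import random
import statistics

random.seed(0)
n = 20_000

# Simulated potential outcomes (never jointly observable in real data).
y0 = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [y + 2.0 + random.gauss(0.0, 0.5) for y in y0]  # heterogeneous effects, mean 2

true_ate = statistics.fmean([a - b for a, b in zip(y1, y0)])

# Randomize: exactly half of the units are treated.
d = [1] * (n // 2) + [0] * (n // 2)
random.shuffle(d)

# Observed outcome: Y = Y(0) + [Y(1) - Y(0)] * D
y = [b + (a - b) * t for a, b, t in zip(y1, y0, d)]

treated = [yi for yi, t in zip(y, d) if t == 1]
control = [yi for yi, t in zip(y, d) if t == 0]
diff_in_means = statistics.fmean(treated) - statistics.fmean(control)

print(f"true ATE: {true_ate:.3f}, difference in means: {diff_in_means:.3f}")
```

The two numbers should agree up to sampling noise, which shrinks as n grows.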


Alternatively, we can express the potential outcomes framework in a regression form, which allows us to connect causal inference concepts with standard regression analysis.

Instead of writing treatment effects as potential outcomes, we can define the observed outcome Yi in terms of a regression equation:

$$Y_i = Y_i(0) + [Y_i(1) - Y_i(0)] D_i$$

where:

  • Yi(0) is the outcome if individual i does not receive treatment.

  • Yi(1) is the outcome if individual i does receive treatment.

  • Di is a binary indicator for treatment assignment:
    • Di=1 if individual i receives treatment.
    • Di=0 if individual i is in the control group.

We can redefine this equation using regression notation:

Yi=β0i+β1iDi

where:

  • β0i=Yi(0) represents the baseline (control group) outcome.

  • β1i = Yi(1) − Yi(0) represents the individual treatment effect.

Thus, in an ideal setting, the coefficient on Di in a regression gives us the treatment effect.


In observational studies, treatment assignment Di is often not random, leading to endogeneity. This means that the error term in the regression equation might be correlated with Di, violating one of the key assumptions of the Ordinary Least Squares estimator.

To formalize this issue, we can express the outcome equation as:

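$$
\begin{aligned}
Y_i &= \beta_{0i} + \beta_{1i} D_i \\
&= (\bar{\beta}_0 + \epsilon_{0i}) + (\bar{\beta}_1 + \epsilon_{1i}) D_i \\
&= \bar{\beta}_0 + \epsilon_{0i} + \bar{\beta}_1 D_i + \epsilon_{1i} D_i
\end{aligned}
$$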

where:

  • $\bar{\beta}_0$ is the average baseline outcome.

  • $\bar{\beta}_1$ is the average treatment effect.

  • ϵ0i captures individual-specific deviations in control group outcomes.

  • ϵ1i captures heterogeneous treatment effects.

By construction, the individual deviations average to zero:

E[ϵ0i]=E[ϵ1i]=0

If treatment assignment is truly random, it is also independent of both deviations, which ensures:

  • No selection bias: Di ⊥ ϵ0i (i.e., treatment assignment is independent of the baseline error).

  • Treatment effect is independent of assignment: Di ⊥ ϵ1i.

However, in observational studies, these assumptions often fail. This leads to:

  • Selection bias: If individuals self-select into treatment based on unobserved characteristics, then Di correlates with ϵ0i.
  • Heterogeneous treatment effects: If the treatment effect varies across individuals and those who expect larger gains are more likely to take treatment, then Di correlates with ϵ1i.

These issues violate the exogeneity assumption in OLS regression, leading to biased estimates of β1.
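
A minimal simulation can illustrate how self-selection on the baseline error biases the naive comparison. Everything here is an assumed setup for illustration: a constant true effect of 1, a standard normal baseline error, and a selection rule in which units with a high baseline error are more likely to enrol.

```python
import random
import statistics

random.seed(1)
n = 20_000

# eps0: unobserved baseline component (e.g. motivation); constant effect of 1.
eps0 = [random.gauss(0.0, 1.0) for _ in range(n)]
y0 = eps0
y1 = [e + 1.0 for e in eps0]

def naive_diff(d):
    """Difference in observed means for a treatment vector d."""
    y = [a if t else b for a, b, t in zip(y1, y0, d)]
    treated = [v for v, t in zip(y, d) if t]
    control = [v for v, t in zip(y, d) if not t]
    return statistics.fmean(treated) - statistics.fmean(control)

# Case 1: randomized assignment, independent of eps0.
d_random = [random.random() < 0.5 for _ in range(n)]

# Case 2: self-selection, units with high eps0 are more likely to enrol.
d_selected = [e + random.gauss(0.0, 1.0) > 0 for e in eps0]

print(f"randomized:    {naive_diff(d_random):.3f}")
print(f"self-selected: {naive_diff(d_selected):.3f}  (true effect is 1.0)")
```

Under randomization the estimate sits near 1, while under self-selection it is pushed well above 1 because treated units have systematically higher baseline outcomes.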


When estimating treatment effects using OLS regression, we need to be aware of potential estimation issues.

  1. OLS Estimator and Difference-in-Means

Under random assignment, the OLS estimator for β1 simplifies to the difference in means estimator:

$$\hat{\beta}_1^{OLS} = \bar{Y}_{\text{treated}} - \bar{Y}_{\text{control}}$$

which is an unbiased estimator of the Average Treatment Effect.

However, when treatment assignment is not random, OLS estimates may be biased due to unobserved confounders.
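
The equivalence between the OLS slope on a binary regressor and the difference in means is an algebraic identity, which the following sketch verifies on simulated data. The data-generating values (intercept 1, effect 2, unit-variance noise) are arbitrary assumptions.

```python
import random
import statistics

random.seed(2)
n = 1_000
d = [random.randint(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * t + random.gauss(0.0, 1.0) for t in d]

# OLS with a single regressor: slope = cov(D, Y) / var(D).
d_bar = statistics.fmean(d)
y_bar = statistics.fmean(y)
cov_dy = sum((t - d_bar) * (v - y_bar) for t, v in zip(d, y))
var_d = sum((t - d_bar) ** 2 for t in d)
beta1_ols = cov_dy / var_d
beta0_ols = y_bar - beta1_ols * d_bar

# Difference in means between treated and control units.
mean_treated = statistics.fmean([v for v, t in zip(y, d) if t == 1])
mean_control = statistics.fmean([v for v, t in zip(y, d) if t == 0])
diff_in_means = mean_treated - mean_control

print(f"OLS slope: {beta1_ols:.6f}, difference in means: {diff_in_means:.6f}")
```

The slope and the difference in means agree up to floating-point error, and the OLS intercept equals the control-group mean.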

  2. Heteroskedasticity and Robust Standard Errors

If treatment effects vary across individuals (i.e., treatment effect heterogeneity), the error term contains an interaction:

ϵi=ϵ0i+Diϵ1i

which leads to heteroskedasticity (i.e., the variance of errors depends on Di and possibly on covariates Xi).

To address this, we use heteroskedasticity-robust standard errors, which ensure valid inference even when variance is not constant across observations.
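
To see why this matters, the sketch below compares the classical (pooled-variance) standard error of the difference in means with an unequal-variance standard error. With a single binary regressor, heteroskedasticity-robust formulas reduce to essentially this unequal-variance form; the exact finite-sample correction depends on the robust variant used. Group sizes and variances are assumptions chosen to make the contrast visible.

```python
import math
import random
import statistics

random.seed(3)
n1, n0 = 2_000, 8_000  # unequal group sizes make the contrast visible

# Control outcomes have variance 1; heterogeneous effects inflate the
# treated-group variance to about 1 + 2**2 = 5.
control = [random.gauss(0.0, 1.0) for _ in range(n0)]
treated = [random.gauss(0.0, 1.0) + 2.0 + 2.0 * random.gauss(0.0, 1.0)
           for _ in range(n1)]

s2_1 = statistics.variance(treated)  # sample variances (n - 1 denominator)
s2_0 = statistics.variance(control)

# Classical OLS standard error: one pooled error variance for both groups.
s2_pooled = ((n1 - 1) * s2_1 + (n0 - 1) * s2_0) / (n1 + n0 - 2)
se_classical = math.sqrt(s2_pooled * (1 / n1 + 1 / n0))

# Unequal-variance standard error: each group keeps its own variance.
se_robust = math.sqrt(s2_1 / n1 + s2_0 / n0)

print(f"classical SE: {se_classical:.4f}, unequal-variance SE: {se_robust:.4f}")
```

In this setup the smaller group is also the noisier one, so the classical formula understates the uncertainty.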


21.10.2 Conditional Average Treatment Effect

Treatment effects may vary across different subgroups in a population. The Conditional Average Treatment Effect (CATE) captures heterogeneity in treatment effects across subpopulations.

Definition

For a subgroup characterized by covariates Xi:

$$\text{CATE} = E[Y(1) - Y(0) \mid X_i]$$

Why is CATE Useful?

  • Heterogeneous Treatment Effects: Certain groups may benefit more from treatment than others.
  • Policy Targeting: Understanding who benefits the most allows for better resource allocation.

Example

  • Policy Intervention: A job training program may have different effects on younger vs. older workers.
  • Medical Treatments: Drug effectiveness may differ by gender, age, or genetic factors.

Estimating CATE allows policymakers and researchers to identify who benefits most from an intervention.
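
With a randomized treatment and a discrete covariate, the simplest CATE estimate is a difference in means within each subgroup. The sketch below assumes a hypothetical job training setting with two age groups and subgroup effects of 3 and 1; these numbers are illustrative only.

```python
import random
import statistics

random.seed(4)
n = 20_000

rows = []
for _ in range(n):
    group = random.choice(["younger", "older"])  # observed covariate X
    d = random.randint(0, 1)                     # randomized treatment
    effect = 3.0 if group == "younger" else 1.0  # assumed subgroup effects
    y = random.gauss(10.0, 1.0) + effect * d
    rows.append((group, d, y))

def cate(group):
    """Difference in means within one covariate-defined subgroup."""
    treated = [y for g, d, y in rows if g == group and d == 1]
    control = [y for g, d, y in rows if g == group and d == 0]
    return statistics.fmean(treated) - statistics.fmean(control)

cate_by_group = {g: cate(g) for g in ("younger", "older")}
print(cate_by_group)
```

With continuous or high-dimensional covariates, subgroup means are no longer practical and a model for the effect as a function of X is needed instead.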


21.10.3 Intention-to-Treat Effect

A key issue in empirical research is non-compliance, where individuals do not always follow their assigned treatment (i.e., either people who are supposed to receive treatment don’t receive it, or people who are supposed to be in the control group receive the treatment). The Intention-to-Treat (ITT) effect measures the impact of offering treatment, regardless of whether individuals actually receive it.

Definition

The ITT effect is the observed difference in means between groups assigned to treatment and control (here D denotes treatment assignment, not treatment receipt):

$$\text{ITT} = E[Y \mid D=1] - E[Y \mid D=0]$$

Why Use ITT?

  • Policy Evaluation: ITT reflects the real-world effectiveness of an intervention, accounting for incomplete take-up.
  • Randomized Trials: ITT preserves randomization, even when compliance is imperfect.

Example: Vaccination

  • A government offers a vaccine (ITT), but not everyone actually takes it.
  • The true treatment effect depends on those who receive the vaccine, which differs from the effect measured under ITT.

Since non-compliance is common in real-world settings, ITT effects are often smaller in magnitude than the effect of actually receiving treatment. In this case, the difference in observed means between the groups assigned to treatment and control is not the Average Treatment Effect but the Intention-to-Treat effect.
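
The dilution of ITT under incomplete take-up can be sketched with a simulation. The assumptions are illustrative: a randomized offer, a 50% take-up rate among those offered, no access for the control group, and an effect of 2 for anyone who actually receives treatment.

```python
import random
import statistics

random.seed(5)
n = 20_000
effect_of_receipt = 2.0  # assumed effect of actually being treated
take_up = 0.5            # assumed share of assigned units who take it

z, d, y = [], [], []
for _ in range(n):
    zi = random.randint(0, 1)  # randomized offer
    di = 1 if (zi == 1 and random.random() < take_up) else 0
    yi = random.gauss(0.0, 1.0) + effect_of_receipt * di
    z.append(zi)
    d.append(di)
    y.append(yi)

# ITT compares groups by assignment, ignoring actual receipt.
mean_assigned = statistics.fmean([v for v, a in zip(y, z) if a == 1])
mean_not_assigned = statistics.fmean([v for v, a in zip(y, z) if a == 0])
itt = mean_assigned - mean_not_assigned

print(f"ITT: {itt:.3f} (effect of receipt: {effect_of_receipt})")
```

Here ITT is close to 1, half the effect of receipt, because only half of the assigned group is actually treated.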


21.10.4 Local Average Treatment Effects

In many empirical settings, not all individuals assigned to treatment actually receive it (non-compliance). Instead of estimating the treatment effect for everyone assigned to treatment (i.e., Intention-to-Treat Effects), we often want to estimate the effect of treatment on those who actually comply with their assignment.

This is known as the Local Average Treatment Effect, also referred to as the Complier Average Causal Effect (CACE).

  • LATE is the treatment effect for the subgroup of compliers—those who take the treatment if and only if assigned to it.
  • Unlike the Conditional Average Treatment Effect, which describes heterogeneity across observable subgroups, LATE focuses on compliance behavior.
  • We typically recover LATE using Instrumental Variables, leveraging random treatment assignment as an instrument.

21.10.4.1 Estimating LATE Using Instrumental Variables

Instrumental variable estimation allows us to isolate the effect of treatment on compliers by using random treatment assignment as an instrument for actual treatment receipt.

From an instrumental variables perspective, LATE is estimated as:

$$\text{LATE} = \frac{\text{ITT}}{\text{Share of Compliers}}$$

where:

  • ITT (Intention-to-Treat Effect) is the effect of being assigned to treatment.

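  • Share of Compliers is the proportion of individuals who take the treatment if and only if assigned to it, estimated by the difference in take-up rates between the assigned-to-treatment and assigned-to-control groups.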
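
As a worked illustration with hypothetical numbers (not taken from any study), suppose the ITT effect is 1.0 and half of the assigned units are compliers:

```latex
\text{ITT} = 1.0, \qquad \text{Share of Compliers} = 0.5
\quad\Longrightarrow\quad
\text{LATE} = \frac{1.0}{0.5} = 2.0
```

The effect among compliers is twice the ITT effect, because the other half of the assigned group contributes no effect to the ITT average.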

21.10.4.2 Key Properties of LATE

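  • As the share of compliers approaches 1, LATE converges to ITT.
  • LATE is at least as large as ITT in absolute value: the share of compliers is at most 1, so dividing by it scales ITT up, whereas ITT averages over both compliers and non-compliers.
  • Standard error rule of thumb:
    • The standard error of LATE is approximately:

      $$\text{SE}(\text{LATE}) \approx \frac{\text{SE}(\text{ITT})}{\text{Share of Compliers}}$$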

  • LATE can also be estimated using a pure placebo group (Gerber et al. 2010).
  • Partial compliance is difficult to study.
  • The IV/2SLS estimator is biased in small samples; Bayesian approaches have been proposed to address this (Long, Little, and Lin 2010; Jin and Rubin 2009, 2008).

21.10.4.3 One-Sided Noncompliance

One-sided noncompliance occurs when we observe only compliers and never-takers in the sample (i.e., no always-takers).

Key assumptions:

  • Exclusion Restriction (Excludability): Never-takers have the same outcomes regardless of assignment (i.e., treatment has no effect on them because they never receive it).

  • Random Assignment Ensures Balance: The share of never-takers is expected to be equal in the treatment and control groups.

Estimation of LATE under one-sided noncompliance:

$$\text{LATE} = \frac{\text{ITT}}{\text{Share of Compliers}}$$

Since the never-takers do not receive treatment, this simplifies estimation.


21.10.4.4 Two-Sided Noncompliance

Two-sided noncompliance occurs when we observe compliers, never-takers, and always-takers in the sample.

Key assumptions:

  • Exclusion Restriction (Excludability): Never-takers and always-takers have the same outcome regardless of treatment assignment.

  • Monotonicity Assumption (No Defiers):

    • There are no defiers, meaning no individuals who take the treatment only when assigned to control and refuse it when assigned to treatment.

    • This assumption is standard in practical studies.

Estimation of LATE under two-sided noncompliance:

$$\text{LATE} = \frac{\text{ITT}}{\text{Share of Compliers}}$$

  • Since always-takers receive treatment regardless of assignment, their presence does not bias LATE as long as monotonicity holds.
  • In practice, monotonicity is often reasonable, as defiers are rare.

| Scenario | What it Measures | When to Use It? | Key Assumptions |
|---|---|---|---|
| Intention-to-Treat | Effect of being assigned to treatment | Policy impact with non-compliance | None (preserves randomization) |
| LATE | Effect on compliers only | When we care about actual treatment effect rather than assignment | Excludability, Monotonicity (No Defiers) |
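
The ratio estimator under two-sided noncompliance can be sketched as follows. The population shares (50% compliers, 30% never-takers, 20% always-takers), the complier effect of 2, and the always-taker effect of 1 are assumptions for illustration. The share of compliers is estimated by the difference in take-up rates between assignment groups, and there are no defiers by construction.

```python
import random
import statistics

random.seed(6)
n = 40_000
shares = {"complier": 0.5, "never_taker": 0.3, "always_taker": 0.2}  # assumed
complier_effect = 2.0                                                # assumed

z, d, y = [], [], []
for _ in range(n):
    kind = random.choices(list(shares), weights=list(shares.values()))[0]
    zi = random.randint(0, 1)  # randomized assignment
    if kind == "complier":
        di = zi
    elif kind == "always_taker":
        di = 1
    else:
        di = 0
    # Assignment affects Y only through D (exclusion restriction).
    effect = complier_effect if kind == "complier" else 1.0
    z.append(zi)
    d.append(di)
    y.append(random.gauss(0.0, 1.0) + effect * di)

def mean_by_assignment(values, arm):
    return statistics.fmean([v for v, a in zip(values, z) if a == arm])

itt = mean_by_assignment(y, 1) - mean_by_assignment(y, 0)
share_compliers = mean_by_assignment(d, 1) - mean_by_assignment(d, 0)
late = itt / share_compliers

print(f"ITT: {itt:.3f}, share of compliers: {share_compliers:.3f}, LATE: {late:.3f}")
```

Although always-takers have a different effect in this setup, the ratio recovers the complier effect: always-takers are treated in both arms, so their contribution cancels in the ITT difference.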

21.10.5 Population vs. Sample Average Treatment Effects

In experimental and observational studies, we often estimate the Sample Average Treatment Effect (SATE) using a finite sample. However, the Population Average Treatment Effect (PATE) is the parameter of interest when making broader generalizations.

Key Issue:
SATE does not necessarily equal PATE due to sample selection bias and treatment imbalance.

See Imai, King, and Stuart (2008) for an in-depth discussion of when SATE diverges from PATE.


Consider a finite population of size N from which we observe a sample of size n (N ≫ n). Half of the sample receives treatment, and half is assigned to control.

Define the following indicators:

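  • Sampling Indicator:
    $$I_i = \begin{cases} 1, & \text{if unit } i \text{ is in the sample} \\ 0, & \text{otherwise} \end{cases}$$
  • Treatment Assignment Indicator:
    $$T_i = \begin{cases} 1, & \text{if unit } i \text{ is in the treatment group} \\ 0, & \text{if unit } i \text{ is in the control group} \end{cases}$$
  • Potential Outcomes Framework:
    $$Y_i = \begin{cases} Y_i(1), & \text{if } T_i = 1 \text{ (Treated)} \\ Y_i(0), & \text{if } T_i = 0 \text{ (Control)} \end{cases}$$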
  • Observed Outcome:
    Since we can never observe both potential outcomes for the same unit, the observed outcome is:

    $$Y_i \mid I_i = 1 \;=\; T_i Y_i(1) + (1 - T_i) Y_i(0)$$

  • True Individual Treatment Effect:
    The individual-level treatment effect is:

    $$\text{TE}_i = Y_i(1) - Y_i(0)$$

However, since we observe only one of Yi(1) or Yi(0), TEi is never directly observed.

21.10.5.1 Definitions of SATE and PATE

  • Sample Average Treatment Effect (SATE): SATE is the average treatment effect within the sample.

    $$\text{SATE} = \frac{1}{n}\sum_{i \in \{I_i = 1\}} \text{TE}_i$$

  • Population Average Treatment Effect (PATE): PATE represents the true treatment effect across the entire population.

    $$\text{PATE} = \frac{1}{N}\sum_{i=1}^{N} \text{TE}_i$$

Since we observe only a subset of the population, SATE may not equal PATE.


21.10.5.2 Decomposing Estimation Error

The baseline estimator for SATE and PATE is the difference in observed means:

$$D = \frac{1}{n/2}\sum_{i \in (I_i=1,\,T_i=1)} Y_i - \frac{1}{n/2}\sum_{i \in (I_i=1,\,T_i=0)} Y_i = (\text{Mean of Treated Group}) - (\text{Mean of Control Group})$$

Define Δ as the estimation error (i.e., deviation from the truth), under an additive model:

Yi(t)=gt(Xi)+ht(Ui)

The estimation error is decomposed into

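$$
\begin{aligned}
\text{PATE} - D = \Delta &= \Delta_S + \Delta_T \\
&= (\text{PATE} - \text{SATE}) + (\text{SATE} - D) \\
&= \text{Sample Selection Bias} + \text{Treatment Imbalance} \\
&= (\Delta_{S_X} + \Delta_{S_U}) + (\Delta_{T_X} + \Delta_{T_U}) \\
&= (\text{Selection on Observables} + \text{Selection on Unobservables}) \\
&\quad + (\text{Treatment Imbalance in Observables} + \text{Treatment Imbalance in Unobservables})
\end{aligned}
$$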

To further illustrate this, we begin by explicitly defining how the total discrepancy PATE − D separates into different components.

Step 1: From PATE − D to ΔS + ΔT

$$\underbrace{\text{PATE} - D}_{\Delta} = \underbrace{(\text{PATE} - \text{SATE})}_{\Delta_S} + \underbrace{(\text{SATE} - D)}_{\Delta_T}$$

  • PATE − D: The total discrepancy between the true population treatment effect and the estimate D.
  • ΔS = PATE − SATE: Sample Selection Bias, i.e., how much the sample ATE differs from the population ATE.
  • ΔT = SATE − D: Treatment Imbalance, i.e., how much the estimated treatment effect deviates from the sample ATE.

Step 2: Breaking Bias into Observables and Unobservables

Each bias term can be decomposed into observed (X) and unobserved (U) factors:

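$$\Delta_S = \underbrace{\Delta_{S_X}}_{\text{Selection on Observables}} + \underbrace{\Delta_{S_U}}_{\text{Selection on Unobservables}}$$

$$\Delta_T = \underbrace{\Delta_{T_X}}_{\text{Treatment Imbalance in Observables}} + \underbrace{\Delta_{T_U}}_{\text{Treatment Imbalance in Unobservables}}$$

Thus, the final expression:

$$
\begin{aligned}
\text{PATE} - D &= \underbrace{(\text{PATE} - \text{SATE})}_{\Delta_S:\ \text{Sample Selection Bias}} + \underbrace{(\text{SATE} - D)}_{\Delta_T:\ \text{Treatment Imbalance}} \\
&= \underbrace{(\Delta_{S_X} + \Delta_{S_U})}_{\text{Selection on } X \,+\, \text{Selection on } U} + \underbrace{(\Delta_{T_X} + \Delta_{T_U})}_{\text{Imbalance in } X \,+\, \text{Imbalance in } U}
\end{aligned}
$$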

This decomposition clarifies the sources of error in estimating the true effect, distinguishing between sample representativeness (selection bias) and treatment assignment differences (treatment imbalance), and further separating these into observable and unobservable components.
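
The first step of the decomposition can be made concrete with a simulated finite population in which both potential outcomes are known. This is a sketch under assumed values (population of 10,000, sample of 500, normally distributed effects); the split into observable and unobservable parts is not attempted here because it requires the additive model's g and h functions.

```python
import random
import statistics

random.seed(7)
N, n = 10_000, 500

# Finite population with heterogeneous unit-level effects TE_i.
y0 = [random.gauss(0.0, 1.0) for _ in range(N)]
te = [random.gauss(1.0, 1.0) for _ in range(N)]
y1 = [a + b for a, b in zip(y0, te)]

sample = random.sample(range(N), n)           # units with I_i = 1
treated = set(random.sample(sample, n // 2))  # half the sample gets T_i = 1

pate = statistics.fmean(te)
sate = statistics.fmean([te[i] for i in sample])

# Baseline estimator D: difference in observed means within the sample.
mean_treated = statistics.fmean([y1[i] for i in sample if i in treated])
mean_control = statistics.fmean([y0[i] for i in sample if i not in treated])
d_hat = mean_treated - mean_control

delta_s = pate - sate   # sample selection error
delta_t = sate - d_hat  # treatment imbalance error
delta = pate - d_hat    # total estimation error

print(f"delta_S = {delta_s:.3f}, delta_T = {delta_t:.3f}, total = {delta:.3f}")
```

Rerunning with different seeds shows that both components fluctuate around zero under random sampling and random assignment, while neither is exactly zero in a given draw.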

21.10.5.2.1 Sample Selection Bias ( ΔS )

Also called sample selection error, this arises when the sample is not representative of the population.

$$\Delta_S = \text{PATE} - \text{SATE} = \frac{N - n}{N}\,(\text{NATE} - \text{SATE})$$

where:

  • NATE (Non-Sample Average Treatment Effect) is the average treatment effect for the part of the population not included in the sample:

$$\text{NATE} = \sum_{i \in (I_i = 0)} \frac{\text{TE}_i}{N - n}$$

To eliminate sample selection bias (ΔS=0):

  1. Redefine the sample as the entire population (N=n).
  2. Ensure NATE=SATE (e.g., treatment effects must be homogeneous across sampled and non-sampled units).

However, when treatment effects vary across individuals, random sampling only guarantees that sample selection error is zero in expectation; it does not eliminate the error in any particular sample.
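
The expression for sample selection error follows from writing PATE as a weighted average, PATE = [n · SATE + (N − n) · NATE] / N. The sketch below checks the identity numerically on a simulated population; the population size, sample size, and effect distribution are arbitrary assumptions.

```python
import random
import statistics

random.seed(8)
N, n = 10_000, 500
te = [random.gauss(1.0, 1.0) for _ in range(N)]  # unit-level effects TE_i

in_sample = set(random.sample(range(N), n))      # units with I_i = 1

pate = statistics.fmean(te)
sate = statistics.fmean([te[i] for i in range(N) if i in in_sample])
nate = statistics.fmean([te[i] for i in range(N) if i not in in_sample])

delta_s = pate - sate
identity_rhs = (N - n) / N * (nate - sate)

print(f"PATE - SATE = {delta_s:.6f}, (N-n)/N * (NATE - SATE) = {identity_rhs:.6f}")
```

The two sides match up to floating-point error, and the value is small but nonzero even though the sample was drawn at random.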

21.10.5.2.2 Treatment Imbalance Error ( ΔT )

Also called treatment imbalance bias, this occurs when the empirical distribution of treated and control units differs.

$$\Delta_T = \text{SATE} - D$$

Key insight:
ΔT → 0 when the treatment and control groups are balanced across both observables (X) and unobservables (U).

Since we cannot directly adjust for unobservables, imbalance correction methods focus on observables.


21.10.5.3 Adjusting for (Observable) Treatment Imbalance

In real-world studies:

  • We can only adjust for observables X, not unobservables U.

  • Residual imbalance in unobservables may still introduce bias after adjustment.

To address treatment imbalance, researchers commonly use:

  1. Blocking
  2. Matching Methods

| Method | Blocking | Matching Methods |
|---|---|---|
| Definition | Random assignment within predefined strata based on pre-treatment covariates. | Dropping, repeating, or grouping observations to balance covariates between treated and control groups (Rubin 1973). |
| When Applied? | Before treatment assignment (in experimental designs). | After treatment assignment (in observational studies). |
| Effectiveness | Ensures exact balance within strata but may require large sample sizes for fine stratification. | Can improve balance, but risk of increasing bias if covariates are poorly chosen. |
| What If Covariates Are Irrelevant? | No effect on treatment estimates. | Worst-case scenario: If matching is done on covariates uncorrelated with treatment but correlated with outcomes, it may increase bias instead of reducing it. |
| Benefits | Eliminates imbalance in observables (ΔTX = 0). Effect on unobservables is uncertain (may help if unobservables correlate with observables). | Reduces model dependence, bias, variance, and mean-squared error (MSE). Matching only balances observables, and its effect on unobservables is unknown. |
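
Blocking, as described in the table, can be sketched in a few lines: group units on a pre-treatment covariate, then randomize within each group. The unit records, the `age_group` covariate, and the half-and-half split are hypothetical choices for illustration.

```python
import random
from collections import Counter

random.seed(9)

# Hypothetical units with one pre-treatment covariate used to form blocks.
units = [{"id": i, "age_group": random.choice(["18-34", "35-54", "55+"])}
         for i in range(600)]

# Group units into strata (blocks) on the covariate.
blocks = {}
for u in units:
    blocks.setdefault(u["age_group"], []).append(u)

# Randomize within each block: half treated, the rest control
# (if a block has an odd size, the extra unit goes to control).
for members in blocks.values():
    random.shuffle(members)
    half = len(members) // 2
    for k, u in enumerate(members):
        u["treated"] = 1 if k < half else 0

counts = Counter((u["age_group"], u["treated"]) for u in units)
for group in sorted(blocks):
    print(group, "treated:", counts[(group, 1)], "control:", counts[(group, 0)])
```

Because assignment happens inside each block, treated and control counts differ by at most one unit per age group, so the blocking covariate is balanced by design rather than by chance.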


21.10.6 Average Treatment Effects on the Treated and Control

In many empirical studies, researchers are interested in how treatment affects specific subpopulations rather than the entire population. Two commonly used treatment effect measures are:

  1. Average Treatment Effect on the Treated (ATT): The effect of treatment on individuals who actually received treatment.
  2. Average Treatment Effect on the Control (ATC): The effect treatment would have had on individuals who were not treated.

Understanding the distinction between ATT, ATC, and ATE is crucial for determining external validity and for designing targeted policies.


21.10.6.1 Average Treatment Effect on the Treated

The ATT measures the expected treatment effect only for those who were actually treated:

$$\text{ATT} = E[Y_i(1) - Y_i(0) \mid D_i = 1] = E[Y_i(1) \mid D_i = 1] - E[Y_i(0) \mid D_i = 1]$$

Key Interpretation:

  • ATT tells us how much better (or worse) off treated individuals are compared to their hypothetical counterfactual outcome (had they not been treated).

  • It is useful for evaluating the effectiveness of interventions on those who self-select into treatment.

21.10.6.2 Average Treatment Effect on the Control

The ATC measures the expected treatment effect only for those who were not treated:

$$\text{ATC} = E[Y_i(1) - Y_i(0) \mid D_i = 0] = E[Y_i(1) \mid D_i = 0] - E[Y_i(0) \mid D_i = 0]$$

Key Interpretation:

  • ATC answers the question: “What would have been the effect of treatment if it had been given to those who were not treated?”

  • It is important for understanding how an intervention might generalize to untreated populations.

21.10.6.3 Relationship Between ATT, ATC, and ATE

Under random assignment and full compliance, we have:

ATE=ATT=ATC

Why?

  • Randomization ensures that treated and untreated groups are statistically identical before treatment.

  • Thus, average treatment effects are the same in expectation across groups, leading to ATT = ATC = ATE.

However, in observational settings, selection bias and treatment heterogeneity may cause ATT and ATC to diverge from ATE.


21.10.6.4 Sample Average Treatment Effect on the Treated

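The Sample ATT (SATT) is the finite-sample analogue of ATT, averaging unit-level effects over the treated units in the sample:

$$\text{SATT} = \frac{1}{n}\sum_{i:\,D_i = 1} \text{TE}_i$$

where:

  • TEi = Yi(1) − Yi(0) is the treatment effect for unit i.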

  • n is the number of treated units in the sample.

  • The summation is taken only over treated units in the sample.

21.10.6.5 Population Average Treatment Effect on the Treated

The Population ATT (PATT) generalizes ATT to the entire treated population:

$$\text{PATT} = \frac{1}{N}\sum_{i:\,D_i = 1} \text{TE}_i$$

where:

  • TEi = Yi(1) − Yi(0) is the treatment effect for unit i.

  • N is the total number of treated units in the population.

  • The summation is taken over all treated individuals in the population.

If the sample is randomly drawn, then SATT ≈ PATT, but if the sample is not representative, SATT may overestimate or underestimate PATT.


21.10.6.6 When ATT and ATC Diverge from ATE

In real-world studies, ATT and ATC often differ from ATE due to treatment effect heterogeneity and selection bias.

21.10.6.6.1 Selection Bias in ATT

If individuals self-select into treatment, then the treated group may be systematically different from the control group.

  • Example:
    • Suppose a job training program is voluntary.
    • Individuals who enroll might be more motivated or have better skills than those who do not.
    • As a result, the treatment effect (ATT) may not generalize to the untreated group (ATC).

This implies:

$$\text{ATT} \neq \text{ATC}$$

unless treatment assignment is random.

21.10.6.6.2 Treatment Effect Heterogeneity

If treatment effects vary across individuals, then:

  • ATT may be larger or smaller than ATE, depending on how treatment effects differ across subgroups.
  • ATC may be larger or smaller than ATT, if the untreated group would have responded differently to treatment.

Example:

  • A scholarship program may be more beneficial for students from lower-income families than for students from wealthier backgrounds.

  • If lower-income students are more likely to apply for the scholarship, then ATT > ATE.

  • However, if wealthier students (who did not receive the scholarship) would have benefited less from it, then ATC < ATE.

Thus, we may observe:

$$\text{ATE} \neq \text{ATT} \neq \text{ATC}$$
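
The scholarship example can be simulated to show the ordering of the three estimands. The numbers are assumptions for illustration: low-income students gain 3 and apply with probability 0.8, while other students gain 1 and apply with probability 0.2. The sketch also checks that ATE is the treated-share-weighted average of ATT and ATC, which holds by construction.

```python
import random
import statistics

random.seed(10)
n = 20_000

te, d = [], []
for _ in range(n):
    low_income = random.random() < 0.5
    effect = 3.0 if low_income else 1.0   # assumed: larger gains for low-income students
    p_apply = 0.8 if low_income else 0.2  # assumed: they also apply more often
    te.append(effect)
    d.append(1 if random.random() < p_apply else 0)

ate = statistics.fmean(te)
att = statistics.fmean([t for t, di in zip(te, d) if di == 1])
atc = statistics.fmean([t for t, di in zip(te, d) if di == 0])
share_treated = statistics.fmean(d)

print(f"ATT = {att:.2f}, ATE = {ate:.2f}, ATC = {atc:.2f}")
print(f"weighted average: {share_treated * att + (1 - share_treated) * atc:.2f}")
```

Under these assumptions ATT is near 2.6, ATE near 2.0, and ATC near 1.4, so selection on gains alone is enough to separate the three.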


| Treatment Effect | Definition | Use Case | Potential Issues |
|---|---|---|---|
| ATE (Average Treatment Effect) | Effect on randomly selected individuals | Policy decisions applicable to entire population | Requires full randomization |
| ATT (Average Treatment Effect on the Treated) | Effect on those who received treatment | Evaluating effectiveness of interventions for targeted groups | Selection bias if treatment is voluntary |
| ATC (Average Treatment Effect on the Control) | Effect if treatment were given to untreated individuals | Predicting treatment effects for new populations | May not be generalizable |

21.10.7 Quantile Average Treatment Effects

Instead of focusing on the mean effect (ATE), Quantile Treatment Effects (QTE) help us understand how treatment shifts the entire distribution of an outcome variable.

The Quantile Treatment Effect at quantile τ is defined as:

$$\text{QTE}_\tau = Q_\tau(Y_1) - Q_\tau(Y_0)$$

where:

  • Qτ(Y1) is the τ-th quantile of the outcome distribution under treatment.

  • Qτ(Y0) is the τ-th quantile of the outcome distribution under control.

When to Use QTE?

  • Heterogeneous Treatment Effects: If treatment effects differ across individuals, ATE may be misleading.
  • Policy Targeting: Policymakers may care more about low-income individuals (e.g., bottom 25%) rather than the average effect.
  • Distributional Changes: QTE allows us to assess whether treatment increases inequality (e.g., benefits the rich more than the poor).

Estimation of QTE

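QTE can be estimated using quantile regression with a treatment indicator or, when treatment is endogenous, instrumental variable quantile methods (Abadie, Angrist, and Imbens 2002; Chernozhukov and Hansen 2005).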

Example: Wage Policy Impact

  • Suppose a minimum wage increase is introduced.
  • The ATE might show a small positive effect on earnings.
  • However, QTE might reveal:
    • No effect at the bottom quantiles (for workers who lose jobs).
    • A positive effect at the median.
    • A strong positive effect at the top quantiles (for experienced workers who benefit the most).

Thus, QTE provides a more detailed view of the treatment effect across the entire income distribution.
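
In a randomized experiment, a simple QTE estimate is the difference between the sample quantiles of the treated and control outcome distributions. The sketch below assumes log-normal earnings and a treatment that scales earnings by 1.5, so absolute gains grow with the earnings level; this data-generating process is illustrative and is not the minimum wage scenario described above.

```python
import random
import statistics

random.seed(11)
n = 10_000

# Assumed data-generating process: log-normal earnings, with treatment
# scaling earnings by 1.5 so that absolute gains grow with earnings.
control = [random.lognormvariate(0.0, 1.0) for _ in range(n)]
treated = [1.5 * random.lognormvariate(0.0, 1.0) for _ in range(n)]

# statistics.quantiles(..., n=10) returns the 9 decile cut points.
deciles_c = statistics.quantiles(control, n=10)
deciles_t = statistics.quantiles(treated, n=10)

qte = {tau: deciles_t[k] - deciles_c[k]
       for tau, k in ((0.1, 0), (0.5, 4), (0.9, 8))}
ate = statistics.fmean(treated) - statistics.fmean(control)

for tau, value in qte.items():
    print(f"QTE at tau={tau}: {value:.3f}")
print(f"ATE: {ate:.3f}")
```

The single ATE figure hides the pattern that the quantile differences reveal: small gains at the 10th percentile and much larger gains at the 90th.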


21.10.8 Log-Odds Treatment Effects for Binary Outcomes

When the outcome variable is binary (e.g., success/failure, employed/unemployed, survived/died), it is often useful to measure the treatment effect in log-odds form.

For a binary outcome Y, define the probability of success as:

P(Y=1|D=d)

The log-odds of success under treatment and control are:

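$$\text{Log-odds}(Y \mid D=1) = \log\left(\frac{P(Y=1 \mid D=1)}{1 - P(Y=1 \mid D=1)}\right)$$

$$\text{Log-odds}(Y \mid D=0) = \log\left(\frac{P(Y=1 \mid D=0)}{1 - P(Y=1 \mid D=0)}\right)$$

The Log-Odds Treatment Effect (LOTE) is then:

$$\text{LOTE} = \text{Log-odds}(Y \mid D=1) - \text{Log-odds}(Y \mid D=0)$$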

This captures how treatment affects the relative likelihood of success in a nonlinear way.

When to Use Log-Odds Treatment Effects?

  • Binary Outcomes: When the treatment outcome is 0 or 1 (e.g., employed/unemployed).
  • Nonlinear Treatment Effects: Log-odds help handle situations where effects are multiplicative rather than additive.
  • Rare Events: Useful in cases where the outcome probability is very small or very large.

Estimation of Log-Odds Treatment Effects

  • Logistic Regression with Treatment Indicator:

    $$\log\left(\frac{P(Y=1 \mid D)}{1 - P(Y=1 \mid D)}\right) = \beta_0 + \beta_1 D$$

    where β1 represents the log-odds treatment effect.

  • Randomization-Based Estimation: Freedman (2008) provides a framework for randomized trials that ensures consistent estimation.

  • Attributable Effects: Alternative methods, such as those in Rosenbaum (2002), estimate the proportion of cases attributable to the treatment.
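
For a single binary treatment indicator and no covariates, the log-odds treatment effect can be computed directly from the two success proportions; in that saturated case the logistic regression coefficient β1 equals the same sample log odds ratio. The counts below are hypothetical.

```python
import math

# Hypothetical 2x2 table from a randomized trial.
treated_success, treated_total = 60, 100
control_success, control_total = 40, 100

def log_odds(successes, total):
    """Log-odds of success given a count of successes out of total."""
    p = successes / total
    return math.log(p / (1 - p))

lote = (log_odds(treated_success, treated_total)
        - log_odds(control_success, control_total))
odds_ratio = math.exp(lote)

print(f"LOTE = {lote:.4f}, odds ratio = {odds_ratio:.2f}")
```

With these counts the odds of success are 1.5 under treatment and about 0.67 under control, giving an odds ratio of 2.25 and a LOTE of about 0.81.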


21.10.9 Summary Table: Treatment Effect Estimands

| Treatment Effect | Definition | Use Case | Key Assumptions | When It Differs from ATE? |
|---|---|---|---|---|
| Average Treatment Effect | The expected treatment effect for a randomly chosen individual in the population. | General policy evaluation; measures the overall impact. | Randomization or strong ignorability (treatment assignment independent of potential outcomes). | - |
| Conditional Average Treatment Effect | The treatment effect for a specific subgroup of the population, conditional on covariates X. | Identifies heterogeneous effects; useful for targeted interventions. | Treatment effect heterogeneity must exist. | Differs when treatment effects vary across subgroups. |
| Intention-to-Treat Effect | The effect of being assigned to treatment, regardless of actual compliance. | Policy evaluations where non-compliance exists. | Randomized treatment assignment ensures unbiased estimation. | Lower than ATE when not all assigned individuals comply. |
| Local Average Treatment Effect | The effect of treatment only on compliers, those who take the treatment if and only if assigned to it. | When compliance is imperfect, LATE isolates the effect for compliers. | Monotonicity (no defiers); instrument only affects the outcome through treatment. | Differs from ATE when compliance is selective. |
| Average Treatment Effect on the Treated | The effect of treatment on those who actually received the treatment. | Used when assessing effectiveness of a treatment for those who self-select into it. | No unmeasured confounders within the treated group. | Differs when treatment selection is not random. |
| Average Treatment Effect on the Control | The effect the treatment would have had on individuals who were not treated. | Predicts the effect of expanding a program to the untreated population. | No unmeasured confounders within the control group. | Differs when treatment effects are heterogeneous. |
| Sample Average Treatment Effect | The estimated treatment effect in the sample. | Used when evaluating treatment within a specific sample. | Sample must be representative of the population for external validity. | Differs when the sample is not representative of the population. |
| Population Average Treatment Effect | The expected treatment effect for the entire population. | Policy design and large-scale decision-making. | Requires that sample selection is random. | Differs when sample selection bias exists. |
| Quantile Treatment Effect | The treatment effect at a specific percentile of the outcome distribution. | Understanding distributional effects rather than mean effects. | Rank preservation or monotonicity assumptions may be needed. | Differs when treatment effects vary across outcome quantiles. |
| Log-Odds Treatment Effect | The effect of treatment on binary outcomes, expressed in log-odds. | Used when outcomes are dichotomous (e.g., employed/unemployed, survived/died). | Logistic model assumptions must hold. | Differs when treatment effects are nonlinear or outcome probabilities are low. |

References

Abadie, Alberto, Joshua Angrist, and Guido Imbens. 2002. “Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings.” Econometrica 70 (1): 91–117.
Chernozhukov, Victor, and Christian Hansen. 2005. “An IV Model of Quantile Treatment Effects.” Econometrica 73 (1): 245–61.
Freedman, David A. 2008. “Randomization Does Not Justify Logistic Regression.” Statistical Science, 237–49.
Gerber, Alan S, Donald P Green, Edward H Kaplan, and Holger L Kern. 2010. “Baseline, Placebo, and Treatment: Efficient Estimation for Three-Group Experiments.” Political Analysis 18 (3): 297–315.
Imai, Kosuke, Gary King, and Elizabeth A Stuart. 2008. “Misunderstandings Between Experimentalists and Observationalists about Causal Inference.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 171 (2): 481–502.
Jin, Hui, and Donald B Rubin. 2008. “Principal Stratification for Causal Inference with Extended Partial Compliance.” Journal of the American Statistical Association 103 (481): 101–11.
———. 2009. “Public Schools Versus Private Schools: Causal Inference with Partial Compliance.” Journal of Educational and Behavioral Statistics 34 (1): 24–45.
Keele, Luke, and Richard Grieve. 2025. “So Many Choices: A Guide to Selecting Among Methods to Adjust for Observed Confounders.” Statistics in Medicine 44 (5): e10336.
Long, Qi, Roderick JA Little, and Xihong Lin. 2010. “Estimating Causal Effects in Trials Involving Multitreatment Arms Subject to Non-Compliance: A Bayesian Framework.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 59 (3): 513–31.
Rosenbaum, Paul R. 2002. “Attributing Effects to Treatment in Matched Observational Studies.” Journal of the American Statistical Association 97 (457): 183–92.
Rubin, Donald B. 1973. “The Use of Matched Sampling and Regression Adjustment to Remove Bias in Observational Studies.” Biometrics, 185–203.