Design and Analysis of Analytical Sample Surveys for Program Evaluation and Policy Analysis

 

Briefing Notes

 

Joseph George Caldwell

 

17 April 2022

 

Copyright © 2022 Joseph George Caldwell.  All rights reserved.

 

(Briefing Notes: Microsoft Word file design-and-analysis-of-analytical-sample-surveys-briefing-notes.docx, .htm, .pdf

Briefing: Microsoft PowerPoint file design-and-analysis-of-analytical-sample-surveys-briefing.pptx, .pdf)

 

 

1. Context: Two Main Types of Sample Surveys, Categorized by Purpose

 

Descriptive surveys: estimate population characteristics, such as means and totals for the population and subpopulations of interest.  Associational inference: how variables are probabilistically related.  Probabilistic inference (probability models, statistical models and statistical analysis).

 

Analytical surveys: estimate parameters of models, such as the social and economic impact of a government program, or the effects of changes in government policies. Causal inference: how variables are causally related.  Causal inference (causal modeling and analysis; includes probability models and statistical inference).

 

2. Two Main Types of Sample-Survey Inference, Categorized by Dependence on the Survey Sampling Plan: Design-Based Inference and Model-Based Inference

 

Design-based inference: domain estimates are based on domain sample data and sampling plan (sampling units, selection methods and probabilities) (most sample surveys).

 

“Direct” estimates:

 

Without auxiliary data: Standard descriptive-survey estimates based solely on the sample responses and the survey sampling plan; or

With auxiliary data: Use regression (or ratio) estimates (response is associated with covariates; regression and ratio estimates).

 

The precision of direct estimates depends mainly on the domain sample size.  Direct estimates (and modified direct estimates, such as a regression estimate that uses regression coefficients obtained from the entire survey) have unacceptably large sampling errors for small domains (small or zero sample sizes).

 

Model-based inference: estimates are based on the survey data, and, to a varying degree, on the survey sampling plan and a data-generation model (a model additional to the statistical model describing sample selection and assignment to treatment).

 

“Indirect” estimates:

 

Synthetic estimates.  Use a direct estimate from a large area to make an estimate for a small area, under the assumption that the large and small areas have the same characteristics.  May or may not use auxiliary data.  Mean, ratio or regression estimates.  The precision of the indirect estimate for a domain depends very little on the domain sample size.

 

General statistical models that allow for random between-area variation (i.e., the full range of statistical models).

 

Composite estimates (weighted average of a direct estimate and an indirect estimate, where the weight is set to minimize the mean-squared error).
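
As a worked formula (standard composite estimation; the notation is illustrative, and the optimal weight shown assumes the errors of the two component estimates are uncorrelated):

\[
\hat{\theta}_C = \phi\,\hat{\theta}_D + (1-\phi)\,\hat{\theta}_S,
\qquad
\phi^{*} = \frac{\mathrm{MSE}(\hat{\theta}_S)}{\mathrm{MSE}(\hat{\theta}_D) + \mathrm{MSE}(\hat{\theta}_S)},
\]

where \(\hat{\theta}_D\) is the direct estimate, \(\hat{\theta}_S\) is the indirect (synthetic) estimate, and \(\phi^{*}\) is the weight that minimizes the mean-squared error of the composite estimate.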

 

3. Two Classes of Model-Based Inference: Model-Assisted Inference and Model-Dependent Inference

 

(1) Model-assisted inference: primary goal is still estimation of population or subpopulation characteristics (descriptive), and the estimates are based to some extent on the survey sampling plan.

 

Treatment of nonresponse (unit or item nonresponse)

Small-area statistics (can be handled like a missing-data problem)

 

Examples of model-assisted inference:

 

Generalized Regression (GREG) estimation (uses a regression-type model, but with error terms based on survey design)

“Fay-Herriot”-type models (“mixed” models that include both model errors and design-induced errors from survey sampling)

Latent-variable models, such as Tobit models (censoring, truncation, nonresponse)

Bayesian methods (missing values (unit and item nonresponse); small-area statistics; Markov Chain Monte Carlo (MCMC) methods (Gibbs sampling, Metropolis-Hastings algorithm); Expectation-Maximization (EM) algorithm)

References: Rao, J. N. K., Small Area Estimation (Wiley, 2003) and Särndal, Carl-Erik, Bengt Swensson and Jan Wretman, Model Assisted Survey Sampling (Springer, 1992)

 

3b. Two Classes of Model-Based Inference: Model-Assisted Inference and Model-Dependent Inference (Cont’d.)

 

(2) Model-dependent inference: primary goal is estimation of parameters of a process considered to have generated the population data (i.e., of an underlying causal model).  Estimates are based primarily on the survey data and a data-generation model, not on the survey sampling plan.  (In fact, dependence on the sampling plan, such as model parameters varying by stratum, would be considered as evidence of model misspecification.)

 

Types of Designs:

 

Experimental designs (randomization (for selection of sample units and assignment to treatment), replication, symmetry and local control)

“Broken” experimental designs and quasi-experimental designs (some features of experimental design are lacking or compromised, such as random selection and treatment assignment of primary sample units but not of secondary sample units)

Observational studies (randomization not used to select sample units or assign treatment)

Analytical Survey Designs (panel surveys; pretest-posttest-with-comparison-group)

 

Types of Analysis:

 

Latent-variable models (Tobit models)

General linear statistical model (analysis of variance and covariance, regression analysis)

Generalized linear model (link functions; e.g., logistic regression)

Reference: Jeffrey M. Wooldridge, Econometric Analysis of Cross Section and Panel Data, 2nd ed. (The MIT Press, 2010)

 

This briefing is concerned with model-dependent inference, applied to Analytical Survey Design.

 

4. A Significant Problem Associated with Causal Inference: Causation Cannot Be Inferred from Data Alone

 

A causal model cannot be determined from the data alone.

 

Note that statistical causal inference is concerned with estimation of the effects of causes (interventions), not with identification of the causes of effects.

 

To estimate causal effects (i.e., the average effect of treatment on a randomly selected member of a population), it is necessary to specify a causal model describing the causal relationships among the variables, and to base inferences on this model.

 

This is done in some applications (experimental design, econometric modeling, time-series forecasting) but rarely for other types of investigations (sample surveys, analysis of observational data, data analytics, data mining).  With experimental designs, controlled interventions (changes) are made in certain variables using randomization, and the effects of these interventions on other variables are observed.  With econometric modeling and time-series forecasting, attention focuses on assessment of exogeneity.

 

For descriptive surveys, this is not an issue (since it is not the goal to estimate causal effects).  For analytical surveys, it is a major concern, since the goal of these surveys is to estimate causal effects, but it is generally not feasible to implement sample surveys as experimental designs.

 

Most statistical methods simply measure probabilistic associations among variables, not causal effects.  (Except for books on experimental design, the word “causal” rarely appears in statistics texts.)

 

Issue: How to make causal inferences from sample surveys when an experimental design is not used?

 

5. Standard Statistics Texts Do Not Address the Subject of Causal Inference for Sample Surveys

 

Texts on Sample Survey Design and Analysis Do Not Address the Subject (lacking in both causal modeling and analysis)

 

Cochran, William G., Sampling Techniques, 3rd ed. (Wiley, 1977)

Lohr, Sharon L., Sampling: Design and Analysis (Duxbury Press / Brooks/Cole, 1999)

Kish, Leslie, Survey Sampling (Wiley, 1965)

Thompson, Steven K., Sampling, 3rd ed. (Wiley, 2012)

Hansen, Morris H., William N. Hurwitz and William G. Madow, Sample Survey Methods and Theory (Wiley, 1953)

Särndal, Carl-Erik, Bengt Swensson and Jan Wretman, Model Assisted Survey Sampling (Springer, 1992)

Valliant, Richard, Alan H. Dorfman and Richard M. Royall, Finite Population Sampling and Inference: A Prediction Approach (Wiley, 2000)

Rao, J. N. K., Small Area Estimation (Wiley, 2003)

Valliant, Richard, Jill A. Dever and Frauke Kreuter, Practical Tools for Designing and Weighting Survey Samples (Springer, 2013)

Longford, Nicholas T., Missing Data and Small-Area Estimation (Springer, 2005)

Little, Roderick J. A. and Donald B. Rubin, Statistical Analysis with Missing Data, 2nd ed. (Wiley, 2002)

 

Texts on Causal Inference Do Not Address the Subject in the Context of Sample Surveys

 

Pearl, Judea, Causality: Models, Reasoning, and Inference, 2nd ed. (Cambridge University Press, 2009)

Imbens, Guido W. and Donald B. Rubin, Causal Inference for Statistics, Social and Biomedical Sciences: An Introduction (Cambridge University Press, 2015)

Morgan, Stephen L. and Christopher Winship, Counterfactuals and Causal Inference: Methods and Principles for Social Research, 2nd ed. (Cambridge University Press, 2015)

Lee, Myoung-Jae, Micro-Econometrics for Policy, Program and Treatment Effects (Oxford University Press, 2005)

Angrist, Joshua D. and Jörn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist’s Companion (Princeton University Press, 2009)

Wasserman, Larry, All of Statistics: A Concise Course in Statistical Inference (Springer, 2004)

 

Texts on Econometrics Address the Subject, but Only for the Model-Dependent Case (not for the model-assisted case)

 

Wooldridge, Jeffrey M., Econometric Analysis of Cross Section and Panel Data, 2nd ed. (The MIT Press, 2010)

Greene, William H., Econometric Analysis, 7th ed. (Pearson Education / Prentice Hall 2012)

Heckman, James J. and Edward J. Vytlacil, “Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation” in Handbook of Econometrics Volume 6B, eds. James J. Heckman and Edward E. Leamer (North-Holland / Elsevier 2007)

 

Texts on Analysis of Observational Data and Quasi-Experimental Design Do Not Address the Subject (short on causal modeling and analysis, heavy on propensity-score matching)

 

Rosenbaum, Paul R., Observational Studies, 2nd ed. (Springer, 2002)

Rosenbaum, Paul R., Design of Observational Studies (Springer, 2010)

Rosenbaum, Paul R., Observation and Experiment (Harvard University Press, 2017)

 

6. A Major Problem in Analytical Survey Design: Lack of Technical References

 

Situation summary:

 

The science of statistics is focused on estimation of the strength of probabilistic associations between variables.  Except for texts on experimental design, most statistics texts do not address the issue of estimating causal effects.  There does not exist a body of literature on the subject of Analytical Survey Design.  This briefing describes a methodology developed and used by the author for the design and analysis of analytical sample surveys.

 

First, we review the general theory on the subject of causal inference without experimental designs.

 

7. Causal Inference without Experimental Designs: Must Be Based on a Causal Model

 

George Box once asserted (1966), “To find out what happens to a system when you interfere with it you have to interfere with it (not just passively observe it).”

 

Paul Holland and Donald Rubin coined the aphorism (1986), “No causation without manipulation.”

 

Randomized assignment of treatment enables causal inference by assuring that the probability distribution of all variables is the same for treatment and control groups (i.e., response (outcome) is independent of all variables except treatment).

 

In the absence of randomized intervention, causal inference about a system must be based on assumptions about the causal nature of the system, i.e., on a causal model of the system.

If the causal model is reasonable, then inferences based on the model should be reasonable.

 

A number of causal models have been developed.  All of them involve the concept of conditional independence of treatment assignment (selection) and treatment response, given covariates, but they differ in other respects.

 

We shall now discuss some of these models.

 

7b. Causal Inference without Experimental Designs: Major Methodologies

 

Neyman-Rubin Causal Model (potential outcomes, counterfactuals).  Reference: Holland, Paul W., “Statistics and Causal Inference” (the source of the “no causation without manipulation” motto), Journal of the American Statistical Association, Vol. 81, No. 396 (Dec. 1986), pp. 945-960, with discussion.

 

Rosenbaum-Rubin approach (matching approach, balancing approach, “statistical” approach).  Reference: Rosenbaum, Paul R. and Donald B. Rubin, “The central role of the propensity score in observational studies for causal effects,” Biometrika (1983), vol. 70, no. 1, pp. 41-55.

 

James Heckman approach (regression approach, “econometric” approach).  Reference: Heckman, James J. and Edward J. Vytlacil, “Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation” in Handbook of Econometrics Volume 6B, eds. James J. Heckman and Edward E. Leamer (North-Holland / Elsevier 2007).

 

Judea Pearl’s methodology (structural causal models; specification of causal models using Bayesian networks and Directed Acyclic Graphs (DAGs)).  Reference: Pearl, Judea, Causality: Models, Reasoning, and Inference, 2nd ed. (Cambridge University Press, 2009).

 

7c. Causal Inference without Experimental Designs: Some Definitions

 

Propensity Score (PS): the probability of selection for treatment (in the case of two treatment levels)

Key use: In groups of sample units having the same PS, the difference between means of the treated and untreated units is an unbiased estimate of the causal effect of treatment for the group.

So, if we stratify on the PS, we can obtain an estimate of the causal effect over the whole population.

Key assumption: 0 < PS < 1 for all units.  (Every unit has a positive probability of being assigned to treatment or control.)
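
To make the propensity-score stratification idea concrete, here is a minimal Python sketch (the variable names X, t and y are illustrative assumptions; this is not the author’s SurvDes software):

```python
# Minimal sketch: estimate an average treatment effect by stratifying on an
# estimated propensity score (five strata).  Assumes X (covariates), t (0/1
# treatment indicator) and y (outcome) are NumPy arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ps_stratified_effect(X, t, y, n_strata=5):
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    cuts = np.quantile(ps, np.linspace(0, 1, n_strata + 1)[1:-1])
    stratum = np.digitize(ps, cuts)
    effect = 0.0
    for s in range(n_strata):
        in_s = stratum == s
        if t[in_s].sum() == 0 or (1 - t[in_s]).sum() == 0:
            continue  # no overlap in this stratum; would be flagged in practice
        diff = y[in_s & (t == 1)].mean() - y[in_s & (t == 0)].mean()
        effect += diff * in_s.sum() / len(y)  # weight by stratum share
    return effect
```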

 

Potential outcomes (for binary case): For each unit of the population, there are two hypothetical outcomes, corresponding to the two treatment levels (treatment and control).  After the experiment, one of them is observed.  The unobserved one is called a counterfactual outcome.
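
In the usual potential-outcomes notation (standard definitions, stated here for reference):

\[
Y_i = T_i\,Y_i(1) + (1 - T_i)\,Y_i(0),
\qquad
\mathrm{ATE} = E\bigl[Y_i(1) - Y_i(0)\bigr],
\]

where \(Y_i(1)\) and \(Y_i(0)\) are the potential outcomes of unit \(i\) under treatment and control, \(T_i\) is the treatment indicator, and only one of the two potential outcomes is observed for any given unit.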

 

Potential-outcomes models have been subject to criticism, e.g., Dawid, A. Philip, “Causal Inference without Counterfactuals,” Journal of the American Statistical Association, June 2000, 95, 450, pp. 407-448 / Comments by D. R. Cox et al. / Rejoinder.

 

Which model is correct?  None of them.  George Box: “All models are wrong, but some are useful.”

 

7d. Causal Inference without Experimental Designs: Matching

 

Causal inference makes much use of matching, in both design and analysis, to increase precision and to decrease selection bias.

 

Matching in analysis: There are a variety of different methods of matching.  They are described in articles posted on Professor Gary King’s website ( http://gking.harvard.edu ), including:

 

"Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference," by Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart, Political Analysis, Vol. 15 (2007), pp. 199-236, posted at http://gking.harvard.edu/files/matchp.pdf or http://gking.harvard.edu/files/abs/matchp-abs.shtml; and

 

“MatchIt: Nonparametric Preprocessing for Parametric Causal Inference,” by Daniel E. Ho, Kosuke Imai, Gary King, and Elizabeth A. Stuart (July 9, 2007), posted at http://gking.harvard.edu/matchit/docs/matchit.pdf.

 

Matching in design: For a discussion of the drawbacks of propensity-score matching, see the article:

 

“Why Propensity Scores Should Not Be Used for Matching,” by Gary King and Richard Nielsen, Political Analysis, Volume 27, Issue 4 (October 2019), pp. 435-454, posted at https://gking.harvard.edu/files/gking/files/pan1900011_rev.pdf, with supplementary material posted at https://gking.harvard.edu/files/gking/files/psnot-supp.pdf.

 

Most of the published material on causal inference is concerned with analysis, not with design.  There is no standard reference text that presents a detailed or comprehensive description of procedures or general methodology for constructing analytical survey designs.

 

This author presents a general methodology in the paper, Sample Survey Design for Evaluation (The Design of Analytical Surveys) posted at Internet website http://www.foundationwebsite.org/SampleSurveyDesignForEvaluation.htm.  That methodology is summarized in this briefing.

 

7e. Causal Inference without Experimental Designs: Features of Major Approaches

 

Potential-Outcomes Approach (Rubin, Rosenbaum, Heckman): Do not specify a comprehensive causal model.  Check estimability / identifiability by applying tests for exogeneity (and other tests, such as rank and order tests).  These exogeneity tests can be extremely difficult to apply, from both a substantive (subject-matter) and a technical (statistical) perspective.  The sample design proceeds without reference to a comprehensive causal model.  The exogeneity tests are often applied during the analysis phase, not in the design phase.  Matched pairs are often formed using propensity-score matching (a terrible approach!).

 

Structural Causal Model Approach (Pearl): Specify a comprehensive causal model.  Check estimability / identifiability from a DAG (not from exogeneity tests on individual variables).  The sample design is consistent with and guided by the causal model.  Matched pairs are formed by taking all causal variables of the causal model into account, not just matching on a single variable (i.e., on the propensity score).

 

(Matching on the true propensity score would be reasonable to form a matched comparison group, but estimates based on a matched-comparison-group design are generally much less precise than those based on a matched-pairs design.  The estimated propensity score will omit unobserved variables.  The fundamental reason for matching on the propensity score is to obtain a matched comparison for which the distributions of all variables except treatment are the same as for the treatment group.  Since this goal cannot be achieved when there are unobserved variables, a far better approach is to set up the design so that those variables drop out of estimates of interest (e.g., by selecting the same respondents in a second survey round and using a difference estimate).)

 

7f. Causal Inference without Experimental Designs: Pros and Cons of Alternative Approaches

 

Rosenbaum-Rubin: Focuses on identifying groups for which the distributions of all variables except treatment are the same (conditional independence, “ignorability” of treatment); propensity score.

Good for estimating the average effect of treatment, if all variables associated with selection for treatment are observable.

No procedures for assessing estimability – just conditions (how to test?).

Recursive model (no mutual causation or “loops”).

Better for program evaluation.

 

Heckman: Focuses on investigation of the relationship of causal variables, not just overall effects.  Emphasizes use of design to remove influence of unobserved variables on selection for treatment.  Specifies exogeneity conditions for estimability.

No general methodology for assessing estimability.

Allows for mutually causal variables (nonrecursive, or simultaneous, causal models).

Better for policy analysis.

 

Pearl: Focuses on specification of a structural causal model in the form of a Directed Acyclic Graph (DAG), and identifies practical tests for estimability using graphical techniques.

Greater face validity: Based on observable conditional distributions, not unobservable counterfactuals.

Recursive model.

Equally useful for program evaluation or policy analysis.

 

Both the Rosenbaum and Rubin (R&R) and Heckman approaches are based on causal models.  Both approaches are correct if the respective assumptions are justified.  The R&R approach is simpler, since it simply identifies the observed variables that affect selection (“selection on observables”) and then stratifies or regresses on the propensity score.  The Heckman approach is more general, since it addresses the issue of unobserved variables that affect selection (that is, it encompasses both “selection on observables” and “selection on unobservables”).  It is also more general in that it

·       assesses distributional aspects of impact

·       estimates the effects of policy-relevant variables other than a single program intervention indicator (treatment) variable

·       allows for mutually causal variables (nonrecursive, or simultaneous, causal models).

 

7g. Causal Inference without Experimental Designs: Assessment of Exogeneity without Graphs

 

7h. Causal Inference without Experimental Designs: Some Comments on Causal Inference

 

Some comments on causal inference:

 

By itself, a probability model does not specify causal relationships.  It specifies just associational relationships.  In order to make causal inferences, additional information is required.  There are two basic approaches to causal inference.  The first approach, often called the "statistical" approach, is to specify conditions (such as conditional independence) under which particular estimates are estimates of causal effects.  The second approach to causal inference, which may be called the "causal modeling" approach, is to specify a complete causal model, and then derive causal estimates from the model (and data).  (By "complete" is meant a model that identifies all major variables affecting outputs of interest, and their causal relationships, in situations of interest.)

 

The first approach (the "statistical" approach) is a "minimalist" approach, since it requires fewer assumptions.  Unfortunately, this apparent simplicity is illusory, since, absent an explicit causal model, it is difficult to justify those assumptions. For example, it is easier to defend an assumption of conditional independence (needed to justify certain causal estimates) from a complete description of a causal model and a sampling scheme, than in the absence of a description of the model.

 

In some instances, it is quite unnecessary to specify a complete causal model (i.e., all of the major variables that affect outcome).  For example, in a designed experiment, with randomization used to specify the levels of explanatory variables, the model specification may be restricted to the output variables of interest and the randomized explanatory variables (ignoring all other variables that affect outcome).  (A more detailed model, including covariates, could be considered, to improve precision of estimates, but this is not necessary.)  With randomization and orthogonality of treatment levels, the treatment effect estimates are unbiased estimates of causal effects.

 

8. Example of a Causal Model Represented as a Directed Acyclic Graph

 

 

8b. Example of a Causal Model Represented as a Directed Acyclic Graph

 

 

8c. Example of a Causal Model Represented as a Directed Acyclic Graph

 

 

9. Challenges in Applying Causal-Inference Theory to Analytical Survey Design

 

Objectives are similar to those of experimental design (randomization, replication, symmetry (orthogonality, balance), and local control).

 

Where randomization cannot be used, use a causal model to identify and estimate causal effects, and base matching on the causal model variables (for local control).

 

Use propensity scores in analysis, but not as a basis for forming matched pairs in design.

 

Use design features to remove the effects of unobservable variables that affect both selection for treatment and outcome.

 

To achieve orthogonality (low correlation) and balance (spread, variation) in sampling from finite populations, use marginal stratification with variable probabilities of selection.

 

10. Methodology for Designing Analytical Sample Surveys

 

Most of the published material on causal inference is concerned with analysis, not with design.

 

There is no standard reference text that presents a detailed or comprehensive description of procedures or general methodology for constructing analytical survey designs.

 

This author presents a general methodology in the paper:

Sample Survey Design for Evaluation (The Design of Analytical Surveys) posted at Internet website http://www.foundationwebsite.org/SampleSurveyDesignForEvaluation.htm.

 

Additional material is presented in lecture notes for the courses:

 

Causal Inference and Matching, at http://www.foundationwebsite.org/StatCourse4and5CausalInferenceAndMatching.htm; and

 

Statistical Design and Analysis for Evaluation, at http://www.foundationwebsite.org/StatCourse6and7StatisticalDesignAndAnalysisForEvaluation2DayCourse.htm.

 

That methodology will now be summarized.  It includes elements of all major approaches to causal inference, of experimental design, and of sample survey design.

 

11. Summary of Procedures for Designing Analytical Sample Surveys

 

1. Construct a comprehensive causal model for the process under investigation.  Represent it as a DAG.  Classify variables as observable and unobservable.  Construct a survey design such that unobservable variables will drop out of estimates of interest (e.g., interviewing the same subjects in successive survey rounds of a panel survey if selection is associated with personal characteristics).

 

2. Identify causal effects of interest, and minimal detectable effect sizes for each (i.e., effect sizes that are to be detectable with high probability).

 

3. Use statistical power analysis to determine sample sizes for the survey design. (Allow for nonresponse.)

 

4. A computer program for determining sample sizes for evaluation designs (e.g., pretest-posttest-comparison-group design) is posted at http://www.foundationwebsite.org/SampleSizeEstimationProgram.htm.  Summary information about the program is posted at http://www.foundationwebsite.org/SampleSizeEstimationAnalyticalSurveysGeneric.htm.  Lecture notes on a course in determination of sample size for evaluation surveys are posted at http://www.foundationwebsite.org/StatCourse8SampleSizeDetermination.htm.
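
For orientation, the core of such a power calculation is the standard formula for the sample size needed to detect a difference delta between two group means.  The Python sketch below is illustrative only (it is not the program referenced above, and the design-effect and nonresponse adjustments shown are simple assumptions):

```python
# Sketch of the standard power-based sample-size formula for detecting a
# difference delta between two group means, inflated by a design effect and
# an expected response rate (illustrative assumptions).
from scipy.stats import norm

def n_per_group(delta, sigma, alpha=0.05, power=0.80, deff=1.0, resp_rate=1.0):
    z_a = norm.ppf(1 - alpha / 2)   # two-sided significance level
    z_b = norm.ppf(power)           # desired power
    n = 2.0 * sigma**2 * (z_a + z_b)**2 / delta**2
    return n * deff / resp_rate     # inflate for clustering and nonresponse

# Example: minimum detectable effect of 0.25 sd, design effect 2, 90% response.
print(n_per_group(delta=0.25, sigma=1.0, deff=2.0, resp_rate=0.9))
```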

 

11b. Summary of Procedures for Designing Analytical Sample Surveys (Cont’d.)

 

5. Identify variables that are causally related to output variables of interest, and for which data are available prior to the survey data collection (i.e., that can be used for design).  Do this for each stage of sampling.  (Example: In an evaluation of an agricultural training program, variables such as land elevation, rainfall, temperature and slope may be causally related to impact, and be of interest to planners and policymakers.  These variables are available from Geographic Information System databases, and may be used in the survey design process.)

 

6. Define strata for these variables.  Typically, use from three to five strata, defined by natural boundaries or Neyman boundaries (which are set to form equal intervals on the scale of the cumulative square root of the frequencies); a small illustrative sketch of this rule follows below.  The stratification for each variable is a marginal stratification, not a cross-stratification or nested stratification.  Cross-stratification (such as Kish’s controlled selection) and nested stratification are not feasible since, for 5-10 variables, there would be a very large number of stratum cells, leading to many cells containing few, one, or no population items.  (With marginal stratification, the total number of stratum cells is the sum of the numbers of stratum cells for the individual variables of stratification; for cross-stratification it is the product.)  Since all units must have a nonzero probability of selection (and, preferably, at least two units selected from each stratum, for variance estimation of descriptive statistics), this situation would lead to very large sample sizes, even with collapsing of many strata.  (An example of the “Curse of Dimensionality.”)
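
A minimal sketch of the cumulative-square-root-of-frequency rule for setting stratum boundaries (illustrative Python; the variable names and the simulated data are assumptions):

```python
# Set stratum boundaries at equal intervals on the scale of the cumulative
# square root of the frequencies (cum sqrt(f) rule).
import numpy as np

def cum_sqrt_f_boundaries(values, n_strata=3, n_bins=30):
    freq, edges = np.histogram(values, bins=n_bins)
    cum = np.cumsum(np.sqrt(freq))
    targets = cum[-1] * np.arange(1, n_strata) / n_strata  # equal steps on cum sqrt(f)
    idx = np.searchsorted(cum, targets)
    return edges[idx + 1]  # upper boundaries of the first n_strata - 1 strata

# Example: three strata for a skewed village-size variable.
sizes = np.random.default_rng(1).gamma(shape=2.0, scale=60.0, size=227)
print(cum_sqrt_f_boundaries(sizes, n_strata=3))
```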

 

7. Select from this set of variables a subset having low correlations.  (As a measure of association, use the Cramér phi (φc, V) correlation coefficient, applied to the stratum cells.)  This set typically contains 5-10 variables.
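
A minimal sketch of the Cramér V calculation for a pair of recoded stratification variables (illustrative; the input arrays of stratum codes are assumptions):

```python
# Cramér's V between two stratification variables, computed from the
# contingency table of their recoded stratum cells.
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(codes_a, codes_b):
    table = np.zeros((codes_a.max() + 1, codes_b.max() + 1))
    for a, b in zip(codes_a, codes_b):
        table[a, b] += 1
    chi2, _, _, _ = chi2_contingency(table, correction=False)
    n = table.sum()
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k))
```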

 

8. For each variable, allocate sample units to the stratum cells in such a way as to achieve a high degree of variation.

 

11c. Summary of Procedures for Designing Analytical Sample Surveys (Cont’d.)

 

9. Determine selection probabilities for each sample unit to achieve the desired marginal stratifications.  (Keep variation in probabilities as low as possible.  If the survey is to produce descriptive estimates as well as analytical estimates, it may be desirable to place a “floor” on how small the unit selection probabilities may be.)
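
One simple iterative way to obtain unit selection probabilities whose expected marginal stratum counts match the desired allocations is a raking-style adjustment; the sketch below is purely illustrative (it is not the SurvDes algorithm, and the floor value is an assumption):

```python
# Raking-style sketch: scale unit selection probabilities so that expected
# sample counts match the desired marginal-stratum allocations, with a floor
# on the probabilities.
import numpy as np

def marginal_probabilities(codes, desired, n_iter=50, floor=0.02):
    """codes: dict var -> array of stratum codes per unit;
       desired: dict var -> array of desired expected counts per stratum."""
    n_units = len(next(iter(codes.values())))
    total = sum(d.sum() for d in desired.values()) / len(desired)
    p = np.full(n_units, total / n_units)            # rough starting probability
    for _ in range(n_iter):
        for v, cell in codes.items():
            for s, target in enumerate(desired[v]):
                mask = cell == s
                expected = p[mask].sum()
                if expected > 0 and target > 0:
                    p[mask] *= target / expected     # rake toward the target margin
        p = np.clip(p, floor, 1.0)                   # keep probabilities in (0, 1]
    return p
```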

 

10. If matching is used to construct matched pairs, then base matching on a distance measure that takes into account the relative importance of each variable of stratification on output measures of interest.  (Use strata that are sufficiently “coarse” that there are lots of reasonable match candidates.)  (Note: The use of importance weights in the matching distance function increases the precision of causal estimates and does not introduce bias.)
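
To make the weighted-distance idea in item 10 concrete, here is a minimal greedy nearest-neighbour sketch (illustrative only; the arrays of recoded codes and the weight vector are assumptions, e.g., weights of the kind shown in Table 2):

```python
# Greedy nearest-neighbour matching on recoded design variables, with
# importance weights in the distance function.
import numpy as np

def weighted_match(treated, candidates, weights):
    """treated, candidates: 2-D arrays (units x variables) of recoded codes;
       weights: importance weight per variable.  Returns one candidate index
       per treated unit (each candidate used at most once)."""
    w = np.asarray(weights, dtype=float)
    used, pairs = set(), []
    for row in treated:
        d = np.sqrt((((candidates - row) ** 2) * w).sum(axis=1))  # weighted distance
        for j in np.argsort(d):
            if j not in used:   # best still-unused candidate
                used.add(j)
                pairs.append(j)
                break
    return pairs
```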

 

In general, do not match on the propensity score, for the following reasons:

1. The estimated propensity score (PS) is based on observables, and in many surveys selection is based on unobservables.  Matching on the observed PS may be very ineffective in reducing selection bias.  Configure the design to eliminate bias from unobservables (e.g., if respondents can self-select, then use the same respondents in successive survey waves, and a difference estimator, so that unobserved respondent variables drop out).

2. Matching on the PS is designed to construct comparison groups having the same distributions of causal variables as the treatment group.  It is not useful for constructing matched pairs.  Groups matched on a single variable (such as the PS) will not match well on other variables.  If used to construct matched pairs, PS matching will generally produce very poor matches, leading to decreased precision of estimates of differences.  For matched pairs, use a matching method that pairs units having similar values of the causal variables.

3. The PS should be used in the analysis to reduce selection bias, not in the design.

4. See King/Nielsen article for additional discussion.

 

11. If a “treatment” sample has not yet been selected, use matching to define matched pairs, select the pairs with probabilities such that the marginal-stratification sample allocations are reasonable, and randomly allocate one member of each pair to treatment and one to control.

 

12. If a treatment sample has already been selected, use matching to define matched pairs.

 

11d. Summary of Procedures for Designing Analytical Sample Surveys (Cont’d.)

 

13. In the analysis, to obtain consistent estimates of causal effects, we must condition on (average over) either: (1) all variables affecting output; or (2) all variables affecting selection; or (3) all variables affecting both output and selection.  Make sure that such variables, if observable, are reflected in the variables of stratification.

 

14. For unobserved variables (e.g., farmer characteristics that might affect selection for treatment), configure the survey design so that these variables “drop out” of difference estimates.  All causal variables involved in estimation of a causal effect must be conditioned on or “drop out.”

 

15. Explicitly describe the inferential scope of the study.  For example, if selection for treatment is random and countrywide, the scope of inference will be the causal effect of the project / program intervention relative to the entire country.  If a treatment group has already been selected (e.g., by political means) prior to the sample design and selection, then the scope of inference will be the causal effect of that particular already-selected project.

 

12.  Tips on the Analysis

 

1. Analytical surveys may be designed to enable production of descriptive estimates as well as causal estimates.  Since the person doing the analysis may not be the person who constructed the design, it is very important to verify that the analysis procedures are correct for the design.  A common error in analysis of data sets involving matched pairs is to analyze the sample as if the design involved matched comparison groups, not matched pairs.

 

2. Check the estimability of each effect of interest relative to the causal model.

 

Assessment of estimability is readily accomplished from a DAG (using the back-door criterion and the front-door criterion).


 

12b.  Tips on the Analysis (Cont’d.)

 


 

12c.  Tips on the Analysis (Cont’d.)

 

An alternative way of assessing the validity of estimates is to specify features of the joint probability distribution of the model variables.  Under this approach, three different situations are described, called exogeneity conditions, relative to estimation of a specified parameter of interest.

 

The three main types of exogeneity are discussed in the book, Co-integration, Error-Correction, and the Econometric Analysis of Non-Stationary Data by Anindya Banerjee, Juan Dolado, John W. Galbraith, and David F. Hendry (Oxford University Press, 1993).  These are discussed in the context of making forecasts from an estimated model.  They are:

 

Weak exogeneity.  In this situation, forecasts may be made based on the conditional distribution given the future values of the explanatory variables.

 

Strong exogeneity.  In this situation, forecasts may be made conditional on forecasted future values of the explanatory variables.

 

Super exogeneity.  In this situation, forecasts may be made conditional on specified future values of the explanatory variables, where the marginal distribution of the explanatory variables may be altered.  (Super exogeneity is needed for policy analysis involving making changes to the distributions of explanatory variables.)

 

While these exogeneity types can be defined in terms of theoretical factorization properties of the joint distribution, there is in general no practical way of assessing whether these conditions hold in practice.  With the Pearl approach, once a causal model diagram is specified in terms of a DAG, the estimability criteria may be readily assessed from visual inspection of the DAG.  (Note that identifiability (estimability) and exogeneity are relative to estimation of a particular parameter (causal effect).)

 

12d.  Tips on the Analysis (Cont’d.)

 

3. Determine what variables must be conditioned on from the causal model.  This is easily done from the DAG.  For example, for estimating a treatment effect, condition on variables in such a way as to “block” all paths connecting outcome and treatment.
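
For reference, the back-door adjustment formula (a standard result of Pearl’s approach; here Z denotes a set of observed variables that blocks all back-door paths from treatment T to outcome Y):

\[
P\bigl(y \mid do(t)\bigr) \;=\; \sum_{z} P\bigl(y \mid t, z\bigr)\, P(z).
\]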

 

4. Use propensity scores to adjust for selection bias, using stratification, inverse weighting, and / or regression.  (Selection bias introduced by unobserved variables is handled in the design, so that the unobserved variables drop out of difference estimates.)

 

5. When using fixed-effects estimators (such as single-difference or double-difference estimators), conduct a Hausman specification test.   (A problem with the fixed-effects (difference) estimator is that effects cannot be estimated for variables that do not change between the two survey rounds.  Another problem is that the fixed-effects estimator is less efficient than the random-effects estimator.  If unobserved variables are present that may be correlated with model explanatory variables, the random-effects estimator may be inconsistent, whereas the fixed-effects estimator is consistent.  If the Hausman test passes (no significant difference between the estimates), the more efficient estimator, which is the random-effects estimator, may be used.)
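
For reference, the Hausman statistic compares the fixed-effects and random-effects coefficient vectors (standard form):

\[
H = \bigl(\hat{\beta}_{FE} - \hat{\beta}_{RE}\bigr)'
    \bigl[\widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE})\bigr]^{-1}
    \bigl(\hat{\beta}_{FE} - \hat{\beta}_{RE}\bigr),
\]

which is asymptotically chi-squared, with degrees of freedom equal to the number of coefficients compared, under the null hypothesis that both estimators are consistent.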

 

12e.  Tips on the Analysis (Cont’d.)

 

6. The parameter estimates are complex, and closed-form expressions will not be available for variances.  Use resampling (bootstrapping) methods to estimate variances and significance levels.  Take care to take all design features into account (e.g., a common error is to analyze a matched-pairs design as if it were a matched-comparison-group design).
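
A minimal sketch of a pair-level bootstrap for a matched-pairs, pretest-posttest design (illustrative; the array layout is an assumption): resample whole matched pairs, not individual villages or farmers, and recompute the double-difference estimate in each replicate.

```python
import numpy as np

def dd_estimate(pairs):
    """pairs: array (n_pairs x 4) of means [treat_post, treat_pre, control_post, control_pre]."""
    return np.mean((pairs[:, 0] - pairs[:, 1]) - (pairs[:, 2] - pairs[:, 3]))

def bootstrap_se(pairs, n_boot=2000, seed=1):
    rng = np.random.default_rng(seed)
    n = len(pairs)
    # Resample matched pairs with replacement; recompute the estimate each time.
    reps = [dd_estimate(pairs[rng.integers(0, n, n)]) for _ in range(n_boot)]
    return np.std(reps, ddof=1)
```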

 

7. Consider multiple estimators (“statistical,” or “balancing” estimates, such as stratification, inverse weighting, or regression using propensity scores; “econometric” estimates, such as Heckman-type models).  Do not, however, present multiple estimates for the same parameter / effect in the final report.  The Rosenbaum-Rubin “statistical” (matching, balancing) approach is used if all that is desired is to estimate an average causal effect; the Heckman “econometric” approach is used if it is desired to quantify the relationship of causal effects to policy variables.

 

8. Conduct a detailed ex-post statistical power analysis for all causal effects of interest.

 

9. Assess crucial assumptions, such as the “overlap” condition (that the probabilities of selection of treatment and comparison units are not equal to zero or one; this is done by comparing the supports of the distributions of propensity scores for the treatment and comparison groups), and the stable-unit-treatment-value assumption (SUTVA), i.e., that each unit’s response is unaffected by the treatment assignments of other units (no interference between units).

 

13. Examples of Analytical Survey Designs Constructed Using the Method Described Above

 

Impact Evaluation of the Farmer Training and Development Activity in Honduras, Millennium Challenge Corporation.  Project final report at http://www.foundationwebsite.org/MCCFTDAEvaluationFinalReportRevisedNov15-2013.htm.

 

Honduras Road Transportation Improvement Project, Millennium Challenge Corporation.  Project final report at http://www.foundationwebsite.org/MCCTransportationProjectEvaluationFinalReportRevisedDec12-2013.htm.

 

Impact Evaluation of the Competitive African Cotton for Pro-Poor Growth Program (“COMPACI”, “Cotton Made in Africa”), Deutsche Investitions- und Entwicklungsgesellschaft GmbH (DEG), in six African countries: Benin, Burkina Faso, Côte d’Ivoire, Zambia, Ghana and Malaŵi.  (Separate surveys in each country.)

 

Monitoring and Evaluation of the Competitive African Cashew Value Chains for Pro-Poor Growth Program, Deutsche Gesellschaft für Technische Zusammenarbeit (GTZ) GmbH, in five African countries: Benin, Burkina Faso, Côte d’Ivoire, Ghana and Mozambique.  (Separate surveys in each country.)

 

13b. Examples of Analytical Survey Designs Constructed Using the Method Described Above (Cont’d)

 

Impact Evaluation of the Programme of Advancement through Health and Education (PATH), Jamaica.  Government of Jamaica.

 

Evaluation of the Performance and Impact of the Olive-Plantation Rehabilitation and Intensification Activity in Rainfed Areas (“Evaluation des performances et de l’impact de l’activité de rehabilitation et d’intensification des plantations d’oliviers au niveau des zones pluviales”), Agence du Partenariat pour le Progrès, Millennium Challenge Account – Maroc, Project Arboriculture Fruitière.

 

Impact Evaluation of Agricultural Development Projects in the Sourou Valley and Comoé Basin, Millennium Challenge Account – Burkina Faso.

 

Impact Evaluation of Conservancy Support and Indigenous Natural Products, Millennium Challenge Account – Namibia.

 

Impact Evaluation of Ghana Water Supply Activity, Millennium Development Authority – Ghana.

 

Impact Evaluation of Feeder Roads Activity, Millennium Development Authority – Ghana.

 

14. Example of Output from Software for Constructing Analytical Sample Designs

 

Here follows an example of output from the SurvDes program posted at http://www.foundationwebsite.org/index12-design-of-analytical-sample-surveys.htm.  The SurvDes program is a Microsoft Access program that must be modified for each application.

 

This example draws from the survey design constructed for the Impact Evaluation of the COMPACI (Cotton Made in Africa) Benin Project.

 

First-stage sample unit: village

Second-stage sample unit: farmer

 

Sample design: Pretest-posttest-with-matched-pairs (matching of villages)

First-stage sample of 20 treatment villages selected with variable probabilities (to achieve desired marginal stratification)

First-stage sample of 20 matching villages (matched pairs): desired sample of 16, plus four possible replacements.

Second-stage sample of 16 farmers selected from each sample village using systematic selection with a random start

Sample sizes at both sampling stages determined by statistical power analysis.  (Article, Determination of Sample Size for Analytical Surveys Using a Pretest-Posttest-Comparison-Group Design, posted at http://www.foundationwebsite.org/SampleSizeEstimationAnalyticalSurveysGeneric.htm.  Program, Sample Size Analysis Program, posted at http://www.foundationwebsite.org/SampleSizeEstimationProgram.htm.)

 

Data sources for sample design: program (CMiA) data and vendor GIS data

 

Selected design variables (from causal model): SisterCommune, NumFarmers, RevenueIndex, EducationIndex, LongevityIndex, Precip, Temp, Elev, Yield, and Vegetable Productivity Index (VPI).

 

Farmer characteristics (e.g., ambition, education, wealth, background) would affect the decision to participate in the program.  Use a before-and-after sample of the same farmers and a difference estimator to remove farmer self-selection bias.  (Note: Resampling of farmers eliminates self-selection bias and increases precision.  Resampling of villages increases precision (selection bias is not an issue).)
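
The reason resampling the same farmers removes self-selection bias can be seen from a simple fixed-effects representation (a standard argument, stated here with illustrative notation):

\[
Y_{it} = \alpha_i + \lambda_t + \tau\, T_i\,\mathrm{Post}_t + \varepsilon_{it},
\qquad
\hat{\tau}_{DD} = (\bar{Y}_{T,\mathrm{post}} - \bar{Y}_{T,\mathrm{pre}})
                - (\bar{Y}_{C,\mathrm{post}} - \bar{Y}_{C,\mathrm{pre}}),
\]

where the unobserved farmer effect \(\alpha_i\) (ambition, education, wealth, background) is differenced out because the same farmers appear in both survey rounds.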

 

Here follow some comments on the survey design process and a few tables constructed by the SurvDes program.

 

To facilitate stratification and matching, the variables are recoded into a set of up to nine different values.  These recoded values are used both for matching and for stratification.  The recoding of the variables is as follows (code value is followed by original values; intervals open on left and closed on right):

 

SisterCommune. 1; 2; 3 (original values, not recoded)

NumFarmers.  0: 75-; 1: 76-200; 2: 201+.

RevenueIndex. 0: .35-; 1: .35-.38; 2: .38+.

EducationIndex. 0: .25-; 1: .25-.32; 2: .32+.

LongevityIndex. 0:.55-; 1: .55-.65; 2: .65+.

Precip. 0: 1000-; 1: 1000-1150; 2: 1150+.

Temp. 0: 26.6-; 1: 26.6-28.2; 2: 28.2+.

Elev. 0: 200-; 1: 200-400; 2: 400+.

Yield. 0: 800-; 1: 800-1100; 2: 1100+.

VPI. 0: 130-; 1: 130-140; 2: 140+.

 

For all variables except NumFarmers, natural boundaries (determined by inspection) were used for the coding.  The procedure for NumFarmers was as follows.

 

Ordinarily, with a fixed number of farmers selected from each village, the first-stage sample units would be selected with probabilities proportional to size (number of farmers).  That approach does not work here, since the selection probabilities are fixed to achieve expected marginal stratification constraints at the village level, not to make the probabilities of selection uniform at the farmer level.  This latter objective is achieved instead by stratifying on the number of farmers (NumFarmers).  A good method for setting the stratum boundaries in this case is to use the so-called Neyman boundaries (which are set to form equal intervals on the scale of the cumulative square root of the frequencies (sizes)).  The stratum boundaries specified above for NumFarmers are the Neyman boundaries.

 

For the purpose of matching, the design variables were classified into three groups of variables related from the viewpoint of a causal model relating them to outcome.  Group 1 consisted of the Yield and VPI; group 2 consisted of the three physiographic variables Precipitation, Temperature and Elevation; and group 3 consisted of all the remaining demographic / social-indicator variables, SisterCommune, Number of Farmers, Revenue Index, Education Index, and Longevity Index.  Group 1 was assigned a relative weight of 3, group 2 a relative weight of 2 and group 3 a relative weight of 1.  Within these three groups, the relative weight of each component variable is as specified in Table 2, “Matching Importance Weights.”  (The group weight is split among the members of the group.  Within a group, variables (or subgroups of related variables) considered to have the strongest relationship to outcome are assigned the highest weights.)  It is noted that the results of matching are not highly sensitive to modest variations in the relative magnitudes of the weights, as long as the ranking of the weights remains the same.  (Note that the use of importance weights in the matching procedure increases precision and does not introduce bias.)

 

Table 1. Basic Statistics for the Treatment Population

 

Field Definitions:

IDNO: Record identification number

Name: Statistic

Other field (column) headings (“SisterCommune,” “NumFarmers,” etc.) are as defined in the text.

The last ten rows of the table contain the Cramér coefficient of correlation between each pair of design variables.

 

 

IDNO  Name                   SisterCommune  NumFarmers  RevenueIndex  EducationIndex  LongevityIndex  Precip   Temp   Elev    Yield    VPI
1     Mean                   2.17           124.36      0.36          0.28            0.64            1088.14  27.18  309.26  960.05   131.75
2     Standard Deviation     0.78           159.88      0.02          0.05            0.13            73.14    0.88   96.25   198.72   12.30
3     Mode                   3.00           25.95       0.36          0.28            0.64            1104.70  26.50  355.27  1079.02  132.70
4     Minimum                1.00           1.00        0.34          0.18            0.51            941.00   25.50  146.00  642.00   97.00
5     Maximum                3.00           891.00      0.42          0.33            0.90            1248.00  28.60  525.00  1205.00  179.00
6     Range                  2.00           890.00      0.08          0.15            0.39            307.00   3.10   379.00  563.00   82.00
7     Median                 2.00           66.00       0.36          0.31            0.57            1097.00  26.70  342.66  1054.00  132.45
8     Tritile1               2.00           40.00       0.35          0.27            0.57            1056.00  26.57  222.00  872.00   126.32
9     Tritile2               3.00           101.00      0.38          0.32            0.60            1104.99  28.10  358.14  1110.00  132.48
10    Quartile1              2.00           31.00       0.35          0.23            0.54            1024.01  26.41  197.34  872.00   126.32
11    Quartile2              2.00           66.00       0.36          0.31            0.57            1097.00  26.70  342.66  1054.00  132.45
12    Quartile3              3.00           128.00      0.38          0.32            0.77            1150.00  28.15  368.00  1110.00  133.00
13    Quintile1              1.00           27.00       0.34          0.23            0.54            1006.00  26.41  184.52  642.00   124.00
14    Quintile2              2.00           48.00       0.36          0.27            0.57            1077.00  26.57  320.00  872.00   132.25
15    Quintile3              3.00           78.00       0.36          0.32            0.60            1103.80  27.24  358.14  1054.00  132.45
16    Quintile4              3.00           180.00      0.38          0.33            0.77            1172.19  28.35  406.61  1140.00  143.00
17    Number of values       3.00           100.00      6.00          6.00            6.00            46.00    29.00  46.00   6.00     39.00
20    Cramer2SisterCommune   1.00           0.49        0.18          0.55            0.81            0.26     0.34   0.57    0.75     0.08
21    Cramer2NumFarmers      0.49           1.00        0.30          0.30            0.29            0.10     0.02   0.16    0.33     0.13
22    Cramer2RevenueIndex    0.18           0.30        1.00          0.12            0.08            0.51     0.45   0.24    0.04     0.03
23    Cramer2EducationIndex  0.55           0.30        0.12          1.00            0.76            0.19     0.12   0.11    0.28     0.05
24    Cramer2LongevityIndex  0.81           0.29        0.08          0.76            1.00            0.07     0.17   0.44    0.60     0.16
25    Cramer2Precip          0.26           0.10        0.51          0.19            0.07            1.00     0.93   0.76    0.54     0.10
26    Cramer2Temp            0.34           0.02        0.45          0.12            0.17            0.93     1.00   0.79    0.61     0.19
27    Cramer2Elev            0.57           0.16        0.24          0.11            0.44            0.76     0.79   1.00    0.78     0.43
28    Cramer2Yield           0.75           0.33        0.04          0.28            0.60            0.54     0.61   0.78    1.00     0.46
29    Cramer2VPI             0.08           0.13        0.03          0.05            0.16            0.10     0.19   0.43    0.46     1.00

 

 


Table 2.  Matching Importance Weights

 

Field Definitions:

IDNO: Record identification number

Name: Variable Name & MatchingImportanceWeights

Weight: Relative importance of variable “Variable Name” to program outcome

 

IDNO  Name                                      Weight
1     Total_MatchingImportanceWeights           0
2     Treatment_MatchingImportanceWeights       0
3     Control_MatchingImportanceWeights         0
4     SisterCommune_MatchingImportanceWeights   0.5
5     NumFarmers_MatchingImportanceWeights      0.2
6     RevenueIndex_MatchingImportanceWeights    0.1
7     EducationIndex_MatchingImportanceWeights  0.1
8     LongevityIndex_MatchingImportanceWeights  0.1
9     Precip_MatchingImportanceWeights          1
10    Temp_MatchingImportanceWeights            0.5
11    Elev_MatchingImportanceWeights            0.5
12    Yield_MatchingImportanceWeights           1.5
13    VPI_MatchingImportanceWeights             1.5

 

 


Table 3. Population Frequencies

 

Field Definitions:

IDNO: Record identification number

Name: Variable name & _Population

Last 10 columns: stratum code

 

IDNO  Name                        0    1    2   3   4  5  6  7  8  9
1     Total_Population            227  0    0   0   0  0  0  0  0  0
2     Treatment_Population        83   144  0   0   0  0  0  0  0  0
3     Control_Population          144  83   0   0   0  0  0  0  0  0
4     SisterCommune_Population    0    49   84  94  0  0  0  0  0  0
5     NumFarmers_Population       152  46   29  0   0  0  0  0  0  0
6     RevenueIndex_Population     87   79   61  0   0  0  0  0  0  0
7     EducationIndex_Population   72   122  33  0   0  0  0  0  0  0
8     LongevityIndex_Population   95   52   80  0   0  0  0  0  0  0
9     Precip_Population           32   143  52  0   0  0  0  0  0  0
10    Temp_Population             62   132  33  0   0  0  0  0  0  0
11    Elev_Population             45   134  48  0   0  0  0  0  0  0
12    Yield_Population            79   94   54  0   0  0  0  0  0  0
13    VPI_Population              95   81   51  0   0  0  0  0  0  0

 

 


[Omitted] Table 4. Expected Sample Frequencies, Simple Random Sampling

 

Table 5. Desired Sample Frequencies before Matching

 

Field Definitions:

IDNO: Record identification number

Name: Variable name & _Sample_Desired

Last 10 columns: stratum code

 

IDNO  Name                           0   1  2  3  4  5  6  7  8  9
1     Total_Sample_Desired           20  0  0  0  0  0  0  0  0  0
2     Treatment_Sample_Desired       0   0  0  0  0  0  0  0  0  0
3     Control_Sample_Desired         0   0  0  0  0  0  0  0  0  0
4     SisterCommune_Sample_Desired   0   7  6  7  0  0  0  0  0  0
5     NumFarmers_Sample_Desired      7   6  7  0  0  0  0  0  0  0
6     RevenueIndex_Sample_Desired    7   6  7  0  0  0  0  0  0  0
7     EducationIndex_Sample_Desired  7   6  7  0  0  0  0  0  0  0
8     LongevityIndex_Sample_Desired  7   6  7  0  0  0  0  0  0  0
9     Precip_Sample_Desired          7   6  7  0  0  0  0  0  0  0
10    Temp_Sample_Desired            7   6  7  0  0  0  0  0  0  0
11    Elev_Sample_Desired            7   6  7  0  0  0  0  0  0  0
12    Yield_Sample_Desired           7   6  7  0  0  0  0  0  0  0
13    VPI_Sample_Desired             7   6  7  0  0  0  0  0  0  0

 

 


[Omitted] Table 6. Expected Sample Frequencies before Matching

 

Table 7. Actual Sample Frequencies before Matching

 

Field Definitions:

IDNO: Record identification number

Name: Variable name & _Sample_ActualBeforeMatching

Last 10 columns: stratum code

 

IDNO  Name                                        0   1   2  3   4  5  6  7  8  9
1     Total_Sample_ActualBeforeMatching           20  0   0  0   0  0  0  0  0  0
2     Treatment_Sample_ActualBeforeMatching       0   20  0  0   0  0  0  0  0  0
3     Control_Sample_ActualBeforeMatching         20  0   0  0   0  0  0  0  0  0
4     SisterCommune_Sample_ActualBeforeMatching   0   5   5  10  0  0  0  0  0  0
5     NumFarmers_Sample_ActualBeforeMatching      13  3   4  0   0  0  0  0  0  0
6     RevenueIndex_Sample_ActualBeforeMatching    5   10  5  0   0  0  0  0  0  0
7     EducationIndex_Sample_ActualBeforeMatching  3   12  5  0   0  0  0  0  0  0
8     LongevityIndex_Sample_ActualBeforeMatching  7   5   8  0   0  0  0  0  0  0
9     Precip_Sample_ActualBeforeMatching          5   9   6  0   0  0  0  0  0  0
10    Temp_Sample_ActualBeforeMatching            5   10  5  0   0  0  0  0  0  0
11    Elev_Sample_ActualBeforeMatching            5   10  5  0   0  0  0  0  0  0
12    Yield_Sample_ActualBeforeMatching           3   10  7  0   0  0  0  0  0  0
13    VPI_Sample_ActualBeforeMatching             7   12  1  0   0  0  0  0  0  0

 

 


[Omitted] Table 8. Desired Sampling Fractions before Matching

 

Table 9. Actual Sample Frequencies after Matching

 

Field Definitions:

IDNO: Record identification number

Name: Variable name & _Sample_ActualAfterMatching

Last 10 columns: stratum code

 

IDNO  Name                                       0   1   2   3   4  5  6  7  8  9
1     Total_Sample_ActualAfterMatching           40  0   0   0   0  0  0  0  0  0
2     Treatment_Sample_ActualAfterMatching       20  20  0   0   0  0  0  0  0  0
3     Control_Sample_ActualAfterMatching         20  20  0   0   0  0  0  0  0  0
4     SisterCommune_Sample_ActualAfterMatching   0   7   11  22  0  0  0  0  0  0
5     NumFarmers_Sample_ActualAfterMatching      26  6   8   0   0  0  0  0  0  0
6     RevenueIndex_Sample_ActualAfterMatching    17  12  11  0   0  0  0  0  0  0
7     EducationIndex_Sample_ActualAfterMatching  15  20  5   0   0  0  0  0  0  0
8     LongevityIndex_Sample_ActualAfterMatching  15  5   20  0   0  0  0  0  0  0
9     Precip_Sample_ActualAfterMatching          8   24  8   0   0  0  0  0  0  0
10    Temp_Sample_ActualAfterMatching            10  24  6   0   0  0  0  0  0  0
11    Elev_Sample_ActualAfterMatching            5   29  6   0   0  0  0  0  0  0
12    Yield_Sample_ActualAfterMatching           11  22  7   0   0  0  0  0  0  0
13    VPI_Sample_ActualAfterMatching             21  17  2   0   0  0  0  0  0  0

 

 


Table 10. Table Showing Matched-Pair Details

 

Field Definitions:

IDNO: Record identification number

Treatment: treatment indicator variable. 1: treatment village; 0: control village

InSample: Sample-unit code.  1: treatment village; 2: control village

MatchSetNumber: Matched-pair identifier. Desired sample of 16 matched pairs plus 4 replacement pairs.  Select 16 matched pairs in order of this variable, i.e., starting from the top of the list.

The remaining columns are coded values of the match variables.  Note the similarity of the values of the match variables within each sample pair (many exact matches, and most values that do not match exactly are off by one).

 

IDNO  Treatment  InSample  MatchSetNumber  SisterCommune  NumFarmers  RevenueIndex  EducationIndex  LongevityIndex  Precip  Temp  Elev  Yield  VPI
4    1  1  1   1  0  0  1  1  2  1  1  0  1
1    0  2  1   2  0  2  1  0  1  1  1  0  1
2    1  1  2   2  0  1  2  0  0  2  0  1  1
152  0  2  2   2  0  2  1  0  0  1  1  0  1
3    1  1  3   3  2  2  0  2  1  1  1  2  0
10   0  2  3   3  2  0  0  2  1  1  1  1  0
16   1  1  4   1  0  0  1  1  2  0  1  2  0
5    0  2  4   3  0  0  0  2  1  0  1  1  0
6    1  1  5   2  0  1  2  0  0  2  0  1  1
197  0  2  5   2  0  2  1  0  0  1  1  0  1
7    1  1  6   3  1  1  1  2  1  1  2  1  1
8    0  2  6   3  1  0  0  2  1  1  1  1  0
9    1  1  7   1  0  0  1  1  2  0  1  0  0
59   0  2  7   1  0  1  1  0  2  0  2  0  0
215  0  2  8   2  0  2  1  0  1  2  1  0  1
11   1  1  8   2  0  1  2  0  0  2  0  1  1
65   0  2  9   3  1  0  0  2  1  1  1  1  0
12   1  1  9   3  1  2  1  0  1  1  1  2  1
38   0  2  10  3  2  0  0  2  1  1  1  1  0
13   1  1  10  3  2  2  0  2  1  1  1  2  0
14   0  2  11  2  0  2  1  0  0  1  1  0  0
21   1  1  11  2  1  1  2  0  0  2  0  1  0
19   1  1  15  1  0  0  1  1  2  0  1  2  0
129  0  2  15  3  0  0  0  2  1  0  1  1  0
40   0  2  16  1  0  1  1  0  2  0  1  0  2
20   1  1  16  1  0  0  1  1  2  0  1  0  2
25   0  2  17  3  0  0  0  2  1  1  1  1  0
22   1  1  17  3  0  1  1  2  1  1  2  1  1
30   1  1  18  2  0  1  2  0  0  2  0  1  1
23   0  2  18  2  0  2  1  0  1  1  1  0  1
26   1  1  20  3  0  1  1  2  1  1  2  1  1
32   0  2  20  3  0  0  0  2  1  1  1  1  0
27   1  1  21  3  0  1  1  2  2  0  2  1  1
131  0  2  21  3  1  0  0  2  1  0  1  1  0
28   1  1  22  3  0  1  1  2  1  1  2  1  1
34   0  2  22  3  0  0  0  2  1  1  1  1  0
29   1  1  23  3  2  2  1  0  1  1  1  2  1
87   0  2  23  3  2  0  0  2  1  1  1  1  0
44   1  1  33  3  2  2  0  2  1  1  1  2  0
191  0  2  33  3  2  0  0  2  1  1  1  1  0

 


Table 11. Village Sample for the COMPACI Benin Survey

 

Field Definitions:

ID: Record identification number

IDNO: Record identification number

InSample: Sample-unit code.  1: treatment village; 2: control village

NNMatchCode: ID of matching unit

MatchSetNumber: Matched-pair identifier (Desired sample of 16 matched pairs plus 4 replacement pairs.  Select 16 matched pairs in order of this variable, i.e., starting from the top of the list.)

Treatment: treatment indicator variable. 1: treatment village; 0: control village

Other field (column) names as defined in text.

 

ID, IDNO, InSample, NNMatchCode, MatchSetNumber, Treatment, Departement, Commune, Village, Lat, Long
17, 4, 1, 191, 1, 1, DONGA, DJOUGOU, BORTOKO, 9.8, 1.9
191, 1, 2, 17, 1, 0, ATACORA, Tanguieta, YANGOU, ,
127, 2, 1, 166, 2, 1, ATACORA, MATERI, SINDORI, 10.8065185547, 1.03497314453
166, 152, 2, 127, 2, 0, ATACORA, Tanguieta, KOLEGOU, 10.883333, 1.483333
91, 3, 1, 217, 3, 1, ATACORA, KEROU, SINANGOUROU, 10.6666870117, 1.98101806641
217, 10, 2, 91, 3, 0, BORGOU, Sinende, SEKERE, 10.45, 2.45
66, 16, 1, 224, 4, 1, DONGA, DJOUGOU RURAL, DONPARGO TCHORI, 9.7529296875, 1.80017089844
224, 5, 2, 66, 4, 0, BORGOU, Sinende, TOUME, 10.083333, 2.466667
115, 6, 1, 181, 5, 1, ATACORA, MATERI, DABABOUN, 10.8065185547, 1.03497314453
181, 197, 2, 115, 5, 0, ATACORA, Tanguieta, SANGOU, 10.85, 1.466667
96, 7, 1, 200, 6, 1, ATACORA, KOUANDE, BECHET, 10.4575195313, 1.71392822266
200, 8, 2, 96, 6, 0, BORGOU, Sinende, GAKPEROU, ,
45, 9, 1, 148, 7, 1, DONGA, DJOUGOU, KOUA II, 9.75, 1.766667
148, 59, 2, 45, 7, 0, DONGA, Copargo, COPARGO, 9.8375, 1.548056
184, 215, 2, 122, 8, 0, ATACORA, Tanguieta, TAMBOGRE, 10.683333, 1.333333
122, 11, 1, 184, 8, 1, ATACORA, MATERI, KPEREHOUN, 10.8065185547, 1.03497314453
209, 65, 2, 142, 9, 0, BORGOU, Sinende, KPARO, ,
142, 12, 1, 209, 9, 1, ATACORA, PEHUNCO, SOASSARAROU, 10.3024902344, 2.03790283203
215, 38, 2, 33, 10, 0, BORGOU, Sinende, NIARO, 10.366667, 2.4
33, 13, 1, 215, 10, 1, ATACORA, KEROU, BEREKOSSOU, 10.7, 2.083333
160, 14, 2, 19, 11, 0, ATACORA, Tanguieta, BATIA, 10.905833, 1.489722
19, 21, 1, 160, 11, 1, ATACORA, MATERI, KONEHANDRI, 10.933333, 1
72, 19, 1, 201, 15, 1, DONGA, DJOUGOU RURAL, KPAMALAGOU, 9.7529296875, 1.80017089844
201, 129, 2, 72, 15, 0, BORGOU, Sinende, GAMAGUI, 10.15, 2.433333
145, 40, 2, 15, 16, 0, DONGA, Copargo, ANANDANA, 9.955278, 1.388889
15, 20, 1, 145, 16, 1, DONGA, DJOUGOU, BOUGOU I, 9.433333, 1.6166666
213, 25, 2, 106, 17, 0, BORGOU, Sinende, MONSI, ,
106, 22, 1, 213, 17, 1, ATACORA, KOUANDE, DANRI, 10.4575195313, 1.71392822266
130, 30, 1, 185, 18, 1, ATACORA, MATERI, TETONGA, 10.8065185547, 1.03497314453
185, 23, 2, 130, 18, 0, ATACORA, Tanguieta, TANONGOU, 10.816667, 1.433333
100, 26, 1, 193, 20, 1, ATACORA, KOUANDE, KETERE, 10.4575195313, 1.71392822266
193, 32, 2, 100, 20, 0, BORGOU, Sinende, BONBONROU, ,
20, 27, 1, 218, 21, 1, ATACORA, KOUANDE, BOROKOU PEULH, 10.33333, 1.65
218, 131, 2, 20, 21, 0, BORGOU, Sinende, SEKOKPAROU, 10.066667, 2.35
102, 28, 1, 210, 22, 1, ATACORA, KOUANDE, PELINA, 10.4575195313, 1.71392822266
210, 34, 2, 102, 22, 0, BORGOU, Sinende, KPENATI, ,
144, 29, 1, 204, 23, 1, ATACORA, PEHUNCO, BEKET, 10.3024902344, 2.03790283203
204, 87, 2, 144, 23, 0, BORGOU, Sinende, GUESSOU-BANI, 10.333333, 2.266667
93, 44, 1, 227, 33, 1, ATACORA, KEROU, YAKRIGOROU, 10.6666870117, 1.98101806641
227, 191, 2, 93, 33, 0, BORGOU, Sinende, YARRA, 10.510278, 2.474444
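
The following minimal sketch (Python with pandas; not part of the original briefing notes) illustrates one way to check the matched-pair structure of the two listings above.  The file names are hypothetical; the sketch assumes each listing has been saved as a comma-separated text file with the column headings shown, and that the few villages with no recorded coordinates appear as empty Lat/Long fields.

# --- Sketch: checking the matched-pair structure of the village sample ---
import pandas as pd

match_vars = pd.read_csv("match_variables.csv", skipinitialspace=True)  # IDNO, Treatment, ..., VPI (first listing)
villages = pd.read_csv("village_sample.csv", skipinitialspace=True)     # ID, IDNO, ..., Lat, Long (Table 11)

# Each MatchSetNumber should contain exactly one treatment village (Treatment = 1)
# and one control village (Treatment = 0).
pairs = villages.groupby("MatchSetNumber")["Treatment"].agg(["count", "sum"])
assert (pairs["count"] == 2).all() and (pairs["sum"] == 1).all()

# Tabulate within-pair absolute differences of the coded match variables; per the note
# preceding the first listing, most differences should be zero and the rest small.
match_cols = ["SisterCommune", "NumFarmers", "RevenueIndex", "EducationIndex",
              "LongevityIndex", "Precip", "Temp", "Elev", "Yield", "VPI"]
diffs = (match_vars.groupby("MatchSetNumber")[match_cols]
         .agg(lambda x: abs(x.iloc[0] - x.iloc[1])))
print(diffs.stack().value_counts())
# --- end of sketch ---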

 

 

Supplemental Material (not part of briefing)

 

15. Features of Descriptive Sample Surveys

 

Survey Purpose / Goal

 

The goal is to estimate overall features of the population being surveyed.  Specifically, to make estimates of means and totals for the surveyed population and population subgroups.

 

It is not the intent to make inferences about the process that generated the surveyed population (e.g., about an underlying infinite population of which the surveyed population is a single sample; or of a process by which subjects are selected for treatment).

 

That is, the goal is not to assess the causal impact of certain variables on other variables; the survey measures probabilistic associations, not causal relationships.

 

15b. Features of Descriptive Sample Surveys (Cont’d.)

 

Design Features

 

The design goal is to obtain estimates having a desired level of precision.

 

Complex survey designs include clustering, multistage sampling, stratification; sampling with and without replacement; selection of higher-level units with equal probabilities or probabilities proportional to size.

 

Sample-size determination based on precision analysis, not power analysis.

 

The finite population correction (FPC) applies to variance estimates for without-replacement sampling.  (A sample-size sketch illustrating this and the preceding point follows this list.)

 

Optimal survey design is based on optimum (Neyman) allocation: allocate more sample units to strata where variability is higher and the cost of sampling is lower.  (Strictly, Neyman allocation is the equal-cost special case of optimum allocation; an allocation sketch follows this list.)

 

Selection probabilities are generally constant within strata.  Stratification is cross-stratification, not marginal stratification.  Few stratification variables are used (e.g., controlled selection for two variables).
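
The following minimal sketch (Python; not part of the original briefing notes) illustrates precision-based sample-size determination together with the finite population correction.  The population size N, anticipated unit standard deviation S, and desired margin of error e are assumed values, for illustration only.

# --- Sketch: precision-based sample size with the finite population correction ---
import math

def sample_size(N, S, e, z=1.96):
    """Sample size needed to estimate a mean to within +/- e at about 95% confidence."""
    n0 = (z * S / e) ** 2                # sample size ignoring the finite population
    return math.ceil(n0 / (1 + n0 / N))  # FPC adjustment for without-replacement sampling

print(sample_size(N=5000, S=1.2, e=0.1))  # precision analysis, not power analysis
# --- end of sketch ---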
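
The next sketch (also illustrative, with assumed stratum sizes, standard deviations, and unit costs) shows optimum allocation of a fixed total sample across strata; with equal unit costs it reduces to Neyman allocation.

# --- Sketch: optimum (Neyman) allocation across strata ---
N_h = [1200, 2500, 1300]   # stratum population sizes (assumed)
S_h = [0.8, 1.5, 2.4]      # anticipated within-stratum standard deviations (assumed)
c_h = [1.0, 1.0, 4.0]      # relative per-unit data-collection costs (assumed)
n = 400                    # total sample size to be allocated

weights = [N * S / c ** 0.5 for N, S, c in zip(N_h, S_h, c_h)]  # N_h * S_h / sqrt(c_h)
alloc = [round(n * w / sum(weights)) for w in weights]
print(alloc)  # more sample where variability is high and unit cost is low
# --- end of sketch ---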

 

15c. Features of Descriptive Sample Surveys (Cont’d.)

 

Analysis Features

 

Estimates are design-based, design-unbiased, design-consistent.

 

Data analysis is straightforward, often using closed-form expressions for estimation of means and variances.

 

Estimates for a particular population subgroup are “direct” estimates, based only on the selection probabilities and sample data (including auxiliary data) for that subgroup.  There may be some use of model-assisted analysis, as in treatment of nonresponse or in production of small-area statistics.

 

Tests of hypotheses are irrelevant for finite populations (any two finite-population groups will almost always have different means and totals).

 

16. Features of Analytical Sample Surveys

 

Survey Purpose / Goal

 

The goal is to obtain information for program evaluation and policy analysis, i.e., about the causal effects of certain variables (input variables, control variables) on other variables (output variables).

 

The focus is on estimation of the process that generated the population data, not on estimation of overall features of the population.  Examples: Effects of a farmer-training program; effects of a change in a farm subsidy program; effects of changes in recruitment policies.

 

16b. Features of Analytical Sample Surveys (Cont’d.)

 

Relationship of Analytical Survey Design to Experimental Design

 

The standard approach to estimation of impact is experimental design.  (An experiment involves controlled variation in certain variables, such as assignment to treatment using randomization.  Data from an experiment are experimental data.  Passively observed data are observational data.)

 

Key features of experimental design (ED) are randomization, replication, local control (blocking, control groups, matched pairs) and symmetry (balance, orthogonality).

 

An attempt is made to incorporate the features of ED into analytical sample surveys.  In most sample-survey applications, however, it is not feasible to do so fully.

 

Randomization may not be possible for a variety of reasons, such as program eligibility, legal constraints, ethical constraints, self-selection, or physical constraints (few or no population elements for certain combinations of variables).

 

In ED applications, the levels of control variables are generally easy to specify, such as setting temperature, concentration, time, or dose.  In sample-survey applications it is generally difficult to effect local control (control groups, matched pairs) and symmetry (cross-stratification to achieve balance and orthogonality for a number of control variables).

 

Lacking the methodology of ED, it is necessary to adopt a different approach to causal inference in sample-survey applications.  This is the topic of causal inference (causal modeling and analysis).

 

16c. Features of Analytical Sample Surveys (Cont’d.)

 

Design Features:

 

Major features of analytical survey designs are:

 

Overall similarity to experimental design (e.g., randomization (to the extent possible), replication, symmetry, and local control (control groups, comparison groups, matched pairs))

 

Incorporation of all aspects of descriptive survey design (clustering, multistage sampling, stratification)

 

Comparison groups are formed by matching if randomization is not feasible.  Matched pairs may be formed after treatment has already been assigned.  The matching is based on a causal model.  (A matching sketch follows this list.)

 

Design features are incorporated that cause unobservable variables affecting selection and assignment to treatment to drop out (e.g., inclusion of the same respondents in successive survey rounds).
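
The following minimal sketch (Python with pandas; not part of the original briefing notes) shows one generic way of forming matched pairs after treatment assignment: greedy nearest-neighbour matching on standardized match variables.  The function and column names are assumptions made for illustration; this is not the specific matching procedure used for the Benin village sample.

# --- Sketch: greedy nearest-neighbour matching of treatment and control units ---
import pandas as pd

def match_pairs(frame, match_cols, treat_col="Treatment"):
    """Match each treatment unit to the nearest unused control unit on standardized covariates."""
    X = (frame[match_cols] - frame[match_cols].mean()) / frame[match_cols].std()
    treated = frame.index[frame[treat_col] == 1]
    controls = list(frame.index[frame[treat_col] == 0])
    pairs = []
    for t in treated:
        d = ((X.loc[controls] - X.loc[t]) ** 2).sum(axis=1)  # squared Euclidean distance
        c = d.idxmin()
        pairs.append((t, c))
        controls.remove(c)  # match without replacement
    return pairs

# usage (hypothetical column names): match_pairs(frame, ["NumFarmers", "RevenueIndex", "Precip", "Elev"])
# --- end of sketch ---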

 

16d. Features of Analytical Sample Surveys (Cont’d.)

 

Analysis Features

 

Assess estimability / identifiability of estimates of causal effects relative to a causal model.

 

Reduce selection bias by incorporating the estimated propensity score into the analysis (by stratification, inverse weighting, or regression on the propensity score).  (A sketch follows this list.)

 

Eliminate selection bias associated with unobserved variables (e.g., time-invariant respondent characteristics) by using difference estimators that cause these variables to drop out.  (A sketch follows this list.)

 

Use both fixed-effects and random-effects estimators, and use the Hausman specification test to determine which is appropriate.
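
The following minimal sketch (Python; not part of the original briefing notes) illustrates the propensity-score step: a logistic model of treatment on observed covariates, followed by an inverse-probability-weighted estimate of the average treatment effect.  The column names D (treatment indicator) and Y (outcome) and the covariate list are assumptions; stratification on, or regression adjustment with, the estimated score are alternatives.

# --- Sketch: propensity-score estimation and inverse-probability weighting ---
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(frame, covariates, treat="D", outcome="Y"):
    """Inverse-probability-weighted estimate of the average treatment effect."""
    X, d, y = frame[covariates].values, frame[treat].values, frame[outcome].values
    p = LogisticRegression(max_iter=1000).fit(X, d).predict_proba(X)[:, 1]  # estimated propensity scores
    p = np.clip(p, 0.01, 0.99)  # guard against extreme weights
    return np.mean(d * y / p) - np.mean((1 - d) * y / (1 - p))
# --- end of sketch ---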
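
The next sketch illustrates a simple difference estimator over two survey rounds with the same respondents: time-invariant unobserved respondent characteristics cancel in the baseline-to-follow-up change, and the treatment effect is estimated as the difference of mean changes.  The column names (id, round, D, Y) are assumptions.

# --- Sketch: difference-in-differences with the same respondents in both rounds ---
import pandas as pd

def diff_in_diff(frame, id_col="id", round_col="round", treat="D", outcome="Y"):
    """Treatment-control difference in the mean baseline-to-follow-up change (rounds coded 0 and 1)."""
    wide = frame.pivot(index=id_col, columns=round_col, values=outcome)
    change = wide[1] - wide[0]  # time-invariant unobservables drop out here
    treated = frame.drop_duplicates(id_col).set_index(id_col)[treat]
    return change[treated == 1].mean() - change[treated == 0].mean()
# --- end of sketch ---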

 

17. Differences between Descriptive and Analytical Surveys

 

Differences in Purpose

 

Descriptive surveys are concerned with measurement of observed means, totals and correlations.

 

Analytical surveys are concerned with measurement of causal effects.  (An observed treatment effect (OTE) may be quite different from an average treatment effect (ATE).)

 

Differences in Design

 

For descriptive surveys, generally avoid correlation among sample units.

 

Want low intracluster correlation, in order to keep the “cluster effect” (the loss of precision caused by high intracluster correlation) small.  (See the sketch at the end of this list.)

 

For longitudinal descriptive surveys, want low correlation between panels (successive survey rounds).

 

For analytical surveys, correlations are generally introduced deliberately, such as by using the same respondents in successive survey panels, to increase the precision of estimated differences (and regression coefficients).  (See the sketch at the end of this list.)

 

For descriptive surveys, the FPC enables the use of smaller samples to achieve a specified level of precision.

 

For analytical surveys, the FPC is not relevant.

 

For descriptive surveys, it is generally desired to keep the unit selection probabilities relatively uniform (for high precision of means and totals).

 

For analytical surveys, design requirements will generally impose substantial variation in the selection probabilities.

 

Construct the design to remove the effects of unobserved variables that might be correlated with model explanatory variables (such as selection for treatment) (e.g., by including the same respondents in both survey waves, and using a difference estimator).
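
The following minimal sketch (Python; not part of the original briefing notes) gives two numerical illustrations of the design points above, using assumed values: the design effect 1 + (m - 1)*rho that quantifies the cluster effect, and the variance of an estimated change when the two survey rounds are positively correlated (overlapping panels) rather than independent.

# --- Sketch: cluster effect and the precision of a round-to-round difference ---
m, rho = 20, 0.05                 # cluster size and intracluster correlation (assumed)
deff = 1 + (m - 1) * rho
print(deff)                       # 1.95: the effective sample size is roughly halved

var1, var2, r = 4.0, 4.0, 0.6     # round-estimate variances and between-round correlation (assumed)
var_diff_independent = var1 + var2
var_diff_panel = var1 + var2 - 2 * r * (var1 ** 0.5) * (var2 ** 0.5)
print(var_diff_independent, var_diff_panel)  # 8.0 vs. 3.2: the overlap sharpens the difference
# --- end of sketch ---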

 

17b. Differences between Descriptive and Analytical Surveys (Cont’d.)

 

Differences in Analysis

 

For descriptive surveys, standard statistical-analysis software (e.g., SAS, Stata, SPSS and many others) can be used to quickly produce design-based estimates of quantities of interest (means and totals) for survey data.

 

For analytical surveys, modules are available in these packages to estimate quantities of interest (such as program impacts), but much custom-tailored work is required to construct model-based estimates (econometric modeling).

 

Multiple estimation procedures (full-information maximum likelihood, limited-information maximum likelihood, ordinary least squares, indirect least squares, two-stage least squares, instrumental variables)

Most model variables must be considered random effects, not fixed effects.

Fixed-effects (FE) and random-effects (RE) estimators

Estimability / identifiability must be determined (not just relative to a statistical (associational) model (exogeneity conditions, rank and order conditions, parameter restrictions / exclusions), but relative to a causal model (Pearl’s front-door and back-door tests)).

Model specification tests, such as the Hausman test (a computational sketch appears at the end of this list)

Closed-form expressions are generally not available for variance estimates.  Resampling methods (“bootstrapping”) are used to estimate variances and significance levels (a sketch appears at the end of this list).

Statistical model and estimates must be consistent with the causal model (e.g., endogeneity).

Propensity-score models are used to reduce selection bias (stratification, inverse weighting and regression using propensity scores).
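
The following minimal sketch (Python; not part of the original briefing notes) computes the Hausman statistic from the coefficient vectors and covariance matrices of fixed-effects and random-effects fits of the same model; how those fits are obtained (e.g., from a panel-data package) is left open.

# --- Sketch: Hausman specification test from FE and RE estimates ---
import numpy as np
from scipy import stats

def hausman(b_fe, V_fe, b_re, V_re):
    """Hausman statistic H = d' (V_fe - V_re)^(-1) d, with d = b_fe - b_re."""
    d = np.asarray(b_fe) - np.asarray(b_re)
    V = np.asarray(V_fe) - np.asarray(V_re)
    H = float(d @ np.linalg.pinv(V) @ d)  # pseudo-inverse guards against a singular difference
    p = 1 - stats.chi2.cdf(H, df=len(d))
    return H, p  # a small p-value favours the fixed-effects estimator
# --- end of sketch ---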
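
The final sketch illustrates a cluster bootstrap for variance estimation: first-stage units (e.g., villages) are resampled with replacement and the impact estimator is recomputed on each replicate.  The estimator argument stands for whatever model-based estimate is being used; the names are assumptions.

# --- Sketch: cluster bootstrap standard error of a model-based estimate ---
import numpy as np
import pandas as pd

def cluster_bootstrap_se(frame, estimator, cluster_col="Village", reps=1000, seed=12345):
    """Standard error of estimator(frame) from resampling whole clusters with replacement."""
    rng = np.random.default_rng(seed)
    clusters = frame[cluster_col].unique()
    replicates = []
    for _ in range(reps):
        drawn = rng.choice(clusters, size=len(clusters), replace=True)
        resample = pd.concat([frame[frame[cluster_col] == c] for c in drawn], ignore_index=True)
        replicates.append(estimator(resample))
    return float(np.std(replicates, ddof=1))
# --- end of sketch ---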