Replication is an important tool for research credibility. In this interview, BITSS Program Manager Jo Weech speaks with impact evaluation experts Sridevi Prasad and Douglas Glandon about their replication checklist, designed to help standardize replications of impact evaluations and other social science studies.
As the political landscape and funding for scientific research evolve, it is more important than ever to broadcast scientific advances and tools for research integrity. Open science tools and methods, first adopted primarily in the biomedical sciences, are increasingly being taken up in other research disciplines. An important goal of open science is to increase reproducibility: when research findings don’t replicate, their validity comes into question. Replicating studies is an important element of ensuring that scientific findings are rigorous and credible, and researchers can take steps to make it easier for others to attempt to replicate their findings. Increasingly, donors and journals are implementing new requirements aimed at increasing research transparency and replicability.
However, replications still make up only a small share of published academic papers. Because they are rare, authors of studies being replicated can feel targeted, as if the replication is an attempt to find mistakes rather than an effort to verify the results. Standardized procedures for replication studies can make them easier to perform and make the process more transparent for both replicators and original authors.
Sridevi Prasad and Douglas Glandon are two impact evaluation experts who are thinking about how to make replications easier. I interviewed them in early January to discuss their experience developing a checklist for replicating impact evaluations, which can also be applied more broadly in social science research. Their checklist was published in the Journal of Development Effectiveness (Prasad et al. 2024; see Papers Referenced below). As Prasad noted, the checklist is fundamentally “a communication tool” that helps researchers “bring people onto the same page” by creating a common language across disciplines.
Introducing the Impact Evaluation Experts

Sridevi Prasad is an evaluation expert with a remarkable track record in causal inference research. Her work has spanned critical areas of water, sanitation, and health interventions across Ghana, Kenya, and Bangladesh. Currently pursuing her PhD, Sridevi is developing strategies to promote electric cook stoves and reduce household air pollution – embodying the practical, impact-driven research approach she champions.

Douglas Glandon has led or supported over a dozen impact evaluations and systematic reviews – and peer-reviewed dozens more – spanning a range of sectors, including health, nutrition, agriculture, energy, economic recovery, and others. This paper emerged from Douglas’ role as Leader of Methods Development at 3ie, which focused on advancing and refining the organization’s methodological approaches to evidence generation (including expanding beyond counterfactual-based impact evaluation), synthesis, and application in policymaking. Douglas is now Lead Technical Advisor at the World Bank’s Global Evaluation Initiative.
Designing the Replication Checklist
Since it began, the International Initiative for Impact Evaluation (3ie) has prioritized transparency, reproducibility, and ethical evidence (TREE) in its work. Recognizing that replication has not been fully embraced by the research community or development donors, 3ie funded the replication of influential impact evaluations through its Replication Program, including impact evaluations of HIV programs. It was Sridevi’s and Douglas’ experience working together on one such replication study that prompted them to create the checklist. In an early-stage discussion of the replication design, they wondered aloud whether the authors of the original study would think their analysis plan was fair. Douglas proposed a structure and approach for a checklist that would make their choices defensible and transparent and that could be extended to future replications. Without skipping a beat, Sridevi was all in.
Sridevi and Douglas had seen through their own experience that replication research, though critical, was often viewed skeptically. Researchers sometimes perceived replication efforts as attempts to discredit their work rather than as tools to strengthen their findings. Douglas explained, “We wanted to create a systematic guide that not only helped replication researchers design their studies but also communicated the rationale behind their choices to original authors. This transparency fosters trust and collaboration.”
The development of the checklist followed a multi-step process. Following Douglas’ conceptualization of the project, Sridevi and Douglas started with the framework provided in Brown and Wood’s 2018 article, which offered a broad taxonomy of replication checks organized into four groups: validity of assumptions, data transformations, estimation methods, and heterogeneous impacts. With Sridevi’s leadership and Douglas’ technical guidance, the research team (also including Fiona Kastel, Suvarna Pande, and Chenxiao Zhang) expanded this framework by separating assumptions by identification strategy and including additional checks based on their review of existing replications.
They focused on six common impact evaluation designs: randomized controlled trials, difference-in-differences, instrumental variables analysis, interrupted time series, matching, and regression discontinuity design. They first compiled resources for these six methodologies, then conducted unstructured keyword searches in Google Scholar and used backwards citation tracking to identify additional relevant materials. This process yielded 31 key resources that provided the methodological foundation for designing the checklist.
They then turned to 3ie’s database of replication studies funded between 2014 and 2022 for applicable tools. These replications provided valuable insights into the real-world application of replication methods to impact evaluations and gave them examples of checks and tests to include in the checklist. As Sridevi recalled, “When I was first tasked with managing a replication, I lacked a clear structure.” While 3ie had general replication guidelines, there weren’t specific instructions for what to check in various papers. By analyzing these replication studies through the lens of Brown and Wood’s framework and identifying additional categories needed, Sridevi and Douglas created a more comprehensive and applied tool.
Using the Checklist
Sridevi and Douglas designed their replication checklist to strike a balance between standardization and flexibility. “The whole point is not to be like, these are the rules. This is exactly what you have to follow,” Sridevi explained, “but it’s more of a template and a tool for researchers to use.”
The checklist is organized into five main categories:
- Validation of Assumptions/Conditions
- Data Transformations
- Estimation Methods
- Heterogeneous Outcomes
- Standard Checks
Each category contains detailed tables with five columns:
| Key Attribute | Recommended Test/Check | Comment | Action | Resources |
| --- | --- | --- | --- | --- |
| What should be checked | How to check it | For replication researchers to note their observations | Specific steps the replication researcher will take | Relevant references and guides |
For example, when checking instrumental variable (IV) assumptions under Category 1: Validation of Assumptions/Conditions, a row might look like this:
| Key Attribute | Recommended Test/Check | Comment | Action | Resources |
| --- | --- | --- | --- | --- |
| Relevance condition | Check association between instrument and exposure | Study did not explicitly test this condition even though IV was used | Run t-test to assess association between instrument Z and exposure X | [Relevant methodological papers] |
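To make the “Action” column concrete, here is a minimal sketch, assuming a Python workflow with pandas and statsmodels, of how a replication researcher might test the relevance condition by regressing the exposure on the instrument (with a binary instrument, the regression t-statistic is equivalent to a two-sample t-test). The file and column names (replication_data.csv, exposure, instrument) are hypothetical placeholders, not part of the checklist or any original study.

```python
# Minimal sketch of the relevance-condition check for an IV replication:
# regress the exposure (X) on the instrument (Z) and inspect the association.
# File and column names below are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("replication_data.csv")  # hypothetical analysis dataset

# First-stage regression of the exposure on the instrument
first_stage = smf.ols("exposure ~ instrument", data=df).fit()

print(first_stage.summary())
print("t-statistic for instrument:", first_stage.tvalues["instrument"])
print("First-stage F-statistic:", first_stage.fvalue)

# Rule of thumb: a first-stage F-statistic below ~10 suggests a weak instrument,
# which would be worth recording in the checklist's "Comment" column.
```

In practice, the specification would be adapted to mirror the original study’s first stage, including any covariates, fixed effects, or clustered standard errors, and the result recorded in the “Comment” and “Action” columns of the checklist.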
The checklist’s flexibility is intentional. As Douglas noted, “We try to stick to a relatively modest, broadly applicable set of checklist items… It’s a starting point.” Some sections, like estimation methods, are deliberately broad to accommodate regression techniques and analytical approaches from different fields.
“We’re encouraging people to modify it and adjust it for their own needs,” Sridevi emphasized. “Whether that is adding in more specific analyses that may be relevant for their discipline… they can add their own rows or add whatever they need to make it tailored to their own research.”
Impact Beyond Replication
The replication checklist was designed to be more than just a tool for reproduction studies – it represents a broader approach to research transparency and communication. Douglas emphasized this broader impact: “Every time a story pops up about results being overturned, it pokes another hole in public confidence about science. The most valuable contribution would be signaling upfront to be transparent and systematic about the thinking that goes into methodology.”
The checklist serves multiple purposes throughout the research lifecycle. Sridevi explained, “This is really a communication tool… it helps bring people onto the same page in terms of thinking about [research].” She highlighted its potential use in peer review, noting that “reviewers could use [the checklist] when reviewing a study that’s been submitted to understand both how they may want to structure their comments and make it feel a little bit more objective.”
Douglas compared the checklist’s role to that of a recipe, suggesting it provides a foundation for both standardization and innovation: “If you have a recipe, then you can understand what it will take to produce the dish. Then it becomes easier to think about what you might swap out, how you might change the process.” This flexibility makes the checklist valuable for various research stages, from initial study design to reviewing and replication.
The tool becomes particularly crucial for studies that cannot be replicated due to high costs or unique circumstances. In these cases, thorough documentation of conditions, data transformations, and outcomes becomes essential for accurate interpretation and meta-analysis. Douglas noted, “You can’t just take the findings from a study and run with it as truth. You have to understand the context in which the study was created and how the analysis was done.”
Next Steps for Sridevi and Douglas
Sridevi is embarking on a critical research project in Cambodia, focusing on promoting electric cook stoves to reduce household air pollution. Her dissertation research will apply the transparency and methodological rigor developed through the replication checklist work. She plans to integrate open science practices into her implementation science research, demonstrating how systematic approaches can be applied across different research contexts.
As the lead technical advisor for the World Bank’s Global Evaluation Initiative, Douglas works with a network of organizations around the world that partner with country governments to enhance the systematic generation and use of evidence in their policies and practice, with the aim of improving governance, accountability, and outcomes for their populations and the environment. Transparency in evaluation and research is a critical part of this work, both at the level of individual studies and in the broader systems and processes that govern evidence generation and use.
Papers Referenced
Brown, A. & Wood, B. (2018). Which tests not witch hunts: a diagnostic approach for conducting replication research. Economics, 12(1), 20180053. https://doi.org/10.5018/economics-ejournal.ja.2018-53
Prasad, S. K., Kastel, F., Pande, S., Zhang, C., & Glandon, D. M. (2024). A checklist to guide sensitivity analyses and replications of impact evaluations. Journal of Development Effectiveness, 16(3), 332–348. https://doi.org/10.1080/19439342.2024.2318695