Guest post by Kweku Opoku-Agyemang (Center for Effective Global Action, Cornell Tech, and International Growth Centre)
Improving ethical and transparency standards within and across the social and behavioral science professions appears to be more of a journey than a destination. Thankfully, this is a voyage that has an ever-growing number of travelers.
On October 13-14, I attended and presented at the Association for Integrity and Responsible Leadership in Economics and Associated Professions (AIRLEAP) 2017 economics conference, “An Urgency for Evidence and Transparency in Economic Analysis and Policy,” hosted in St. Charles, Missouri. An advantage of the purposefully small conference was that many participants were open to bluntly sharing their subjective experiences, and even errors in judgment, while doing research.
It was a great couple of days for presentations on research transparency:
- Simon Halliday of Smith College and several others at Project TIER shared lessons learned from training undergraduate students in R and R Markdown for homework and independent projects.
- Benjamin Wood of 3ie also presented various replications his team has been working on, and a relevant checklist for authors.
- Jan Hoeffler of Replication Wiki showed some descriptive evidence that citations might benefit from transparency, which could further incentivize social scientists to embrace transparent and reproducible research methods.
A possible missing piece that was brought up was how to better involve reviewers in the discussion, and how to support important steps like post-publication peer review. I suggested that one complementary way forward might be for economics journals to consider following the procedures of premier statistics journals, where reviewers attempt to replicate authors’ work prior to publication and weigh the outcome heavily in publication decisions. Knowing in advance that replication would play a significant role in the publication decision might inspire more transparency in many authors. An encouraging sign in this direction is the adoption of pre-publication replication by the Journal of Development Economics.
These conversations set the stage for my own presentation on a work-in-progress, “Behavioral Economists, Human-Computer Interactions and Research Transparency.” Here, I use the term “behavioral economist” slightly differently than is the norm. I have always found it interesting that we, as economists, generally consider most people to be behavioral (making errors in judgment, missing deadlines, not following through on their word, procrastinating, and making mistakes), with the exception of economists ourselves. To me, this bounded rationality on our own part helps explain the lack of research transparency in the economics profession. Obviously, if all people are behavioral, then this must be true of economists as well, and out pops a perfectly clear reason why so much research does not replicate: economists weight the present costs of transparency more heavily than its future benefits, and face self-control, temptation and other problems along the way. To explain how this may occur, I generalize time-inconsistency to graph theory, using powerful tools from pure mathematics and computer science. I found it very helpful to explicitly model the code underlying a research project as a network and to endow the system with time-inconsistency.
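For readers who want the standard benchmark behind the phrase “time-inconsistency,” the textbook formalization is quasi-hyperbolic (β-δ) discounting; the sketch below is that familiar benchmark, not the graph-theoretic generalization in the paper itself:

$$
U_t = u(x_t) + \beta \sum_{k=1}^{\infty} \delta^{k} u(x_{t+k}), \qquad 0 < \beta \le 1, \; 0 < \delta < 1.
$$

With β < 1, payoffs that arrive later (such as the credit for a transparent, replicable workflow) are discounted relative to effort saved today, which is precisely the present bias described above.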
The economist doing his or her research, then, can be thought of as someone navigating a garden of forking paths, moving from node to node on a network but taking “detours” when behavioral shortcomings kick in. This is essentially what the statistician Andrew Gelman and others mean by the “garden of forking paths”: we make so many conscious and subconscious decisions while working with our data that, even with the best of intentions, it is extremely unlikely that our work will be entirely replicable.
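To make the idea concrete, here is a minimal sketch in Python of a research pipeline as a small directed graph, with a present-bias parameter beta determining when a tempting “detour” is taken instead of the planned next step. The node names, costs, payoffs, and decision rule are all illustrative assumptions of mine rather than the model in the paper:

```python
# A toy "garden of forking paths": the planned pipeline is a path through a
# graph, and a present-biased researcher (beta < 1) may wander off it.
# All node names, costs, payoffs, and the decision rule are illustrative.

# Planned pipeline: node -> (next planned step, immediate effort cost, delayed payoff)
PLANNED = {
    "write_pap": ("clean_data", 1.0, 2.0),
    "clean_data": ("run_prespecified_model", 3.0, 4.0),
    "run_prespecified_model": ("write_up", 4.0, 6.0),
    "write_up": (None, 2.0, 3.0),
}

# Tempting detours at some nodes: node -> (detour step, immediate cost, immediate payoff)
DETOURS = {
    "clean_data": ("try_alternative_specification", 1.0, 2.0),
    "run_prespecified_model": ("drop_inconvenient_outliers", 0.5, 2.5),
}

def walk(beta=1.0):
    """Traverse the pipeline. The delayed payoff of sticking to the plan is
    scaled by beta, so when beta < 1 an immediately gratifying detour can
    beat the planned next step even if the plan pays off more in the end."""
    path, node = [], "write_pap"
    while node is not None:
        path.append(node)
        next_step, plan_cost, plan_payoff = PLANNED[node]
        if node in DETOURS:
            d_step, d_cost, d_payoff = DETOURS[node]
            planned_value = beta * plan_payoff - plan_cost  # payoff arrives later
            detour_value = d_payoff - d_cost                # payoff arrives now
            if detour_value > planned_value:
                path.append(d_step + " (detour)")
        node = next_step
    return path

print(" -> ".join(walk(beta=1.0)))  # time-consistent researcher: sticks to the plan
print(" -> ".join(walk(beta=0.5)))  # present-biased researcher: detours appear
```

With beta equal to one, the planned path is followed exactly; lowering beta is enough to pull the very same researcher onto the forking paths.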
While it is encouraging that a growing number of journals require authors to provide all data and code, the conference audience agreed that the code submitted to a journal typically reflects a “selection bias” of sorts relative to the universe of queries actually run on the data. Just replicating what authors provide is therefore a frightfully low bar for transparency, because we never know what non-submitted queries were run on a dataset and how those queries may have shaped the outcome of the research. If there were a way to access all queries made during a study, much progress could be made in making observational work transparent.
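Even a very simple logging layer illustrates what such a record could look like. The sketch below is my own illustration rather than an existing tool: it appends every analysis call, with its arguments and a timestamp, to an append-only file that could later be compared with the code actually submitted. The decorator name, log path, and placeholder estimation routine are assumptions:

```python
# A sketch of an append-only query log: every analysis call is recorded with
# its arguments and a timestamp before it runs. The decorator name, log path,
# and the placeholder estimation routine below are assumptions, not a real tool.

import functools
import json
import time

LOG_PATH = "query_log.jsonl"  # hypothetical append-only log file

def logged_query(func):
    """Record the name, arguments, and timestamp of each analysis call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        entry = {
            "query": func.__name__,
            "args": [repr(a) for a in args],
            "kwargs": {k: repr(v) for k, v in kwargs.items()},
            "timestamp": time.time(),
        }
        with open(LOG_PATH, "a") as log:
            log.write(json.dumps(entry) + "\n")
        return func(*args, **kwargs)
    return wrapper

@logged_query
def run_regression(outcome, controls):
    """Placeholder for an actual estimation routine."""
    pass

run_regression("earnings", controls=["education", "age"])
```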
Obviously, pre-analysis plans will need to become more widespread for randomized controlled trials to become more transparent, and BITSS will continue to be an excellent resource for all social scientists interested in open science. Making observational social science more transparent, however, remains an open question. Fiona Burlig has a great 2016 paper proposing protocols for making observational data research more transparent. And Garret Christensen and Edward Miguel richly document the many difficulties of reconciling pre-analysis plans with observational data, since the ability to download data at any time can call the “pre” in pre-analysis plans into question. Did the authors really prepare their pre-analysis plan before getting data access? Even more disconcerting: can we really know whether a replicator downloaded data from a journal’s website only after writing their pre-analysis plan? The stakes are high. The sheer willpower and effort that economics scholars have exerted over the years toward credible causal inference with observational data will have to be applied just as vigorously to making such work more transparent.
A solution I shared at the conference was inspired by recent developments in computer science and human-computer interaction. Trusted timestamping is a technique from computer and information science, increasingly built on the blockchain technology used mainly in cryptocurrencies, that can verifiably establish when a document or software query came into existence, in a way that cannot be manipulated, even by the author. I discussed how an AI-driven, cloud-based system based on trusted timestamping and other innovations could collect all of the queries accompanying a research project in real time while economists do their research as usual. Essentially, authors would be able to prove that a pre-analysis plan was written before any queries were run in statistical software. In theory, this would allow economists working with observational data to benefit from pre-analysis plans in the ways that experimental economists already do. The use of AI sidesteps many concerns, but I hasten to add that the software solution is still a work in progress, and that a person with truly nefarious intentions will always be difficult to rule out entirely. To preserve the integrity of pre-analysis plans, it is of first-order importance that any solution claiming to build on them be implemented only when it is as credible as humanly possible.
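To give a flavor of the mechanics, the sketch below shows only the hashing step that such a system could build on: reducing a pre-analysis plan to a cryptographic digest that can later be anchored publicly (for example, on a blockchain or through a timestamping service) to show that the document existed, unchanged, by a given time. The file name is hypothetical and the anchoring step is not shown; this is not the AI-driven system described in the talk:

```python
# The hashing step behind trusted timestamping: a pre-analysis plan is reduced
# to a SHA-256 digest that can later be anchored publicly to prove the document
# existed, unchanged, by a given time. The file name is hypothetical and the
# anchoring step itself (blockchain or timestamping service) is not shown.

import hashlib

def digest_of(path):
    """Return the SHA-256 digest of a file's contents."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            sha.update(chunk)
    return sha.hexdigest()

pap_hash = digest_of("pre_analysis_plan.pdf")  # hypothetical file
print(pap_hash)  # publishing or anchoring this digest fixes the plan's existence in time
```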
Not all of economics research can (or, in my view, should) be automated, but my suggested solutions aim to help us at least know where subjective choices are being made and flag them in a transparent and precise way. Understanding human-computer interaction, I believe, will be critical to helping the next generation of economists and behavioral scientists on their journey toward transparent research. This step may help our discussions around transparency become more inclusive over time, and perhaps feel less subjective. Our migration along the path toward stronger ethics in the social sciences is only beginning!
About the author: Kweku Opoku-Agyemang is a research fellow affiliated with the Center for Effective Global Action of UC Berkeley and the International Growth Centre. His research interests include development economics, political economy, behavioral economics, research transparency and human-computer interaction.