Impact evaluations are supposed to tell us what works in development, and a lot of time and money goes into them. It’s unfortunate, then, when they fail to report their results clearly. One of the things I found most shocking, looking through a large database of impact evaluations, was how often academic papers omitted information that is critical for interpreting a study’s results and figuring out how well those results might apply to other contexts. This blog post draws on data from over 400 studies that AidGrade found in the course of its meta-analyses. Here are five embarrassing things many papers neglect to report:

1) Attrition

It’s normal for some people to drop out of a study. It can pose a problem, however, if attrition differs between the treatment group and the control group, as this self-selection could bias the study’s results. Although attrition is widely recognized as something one ought to report, only about 75% of papers reported it.
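To see why differential attrition matters, here is a toy sketch (all numbers invented, not drawn from any study in the database) in which the program has no effect at all, yet losing the two worst-off treated people to follow-up manufactures one:

```python
# Toy illustration: differential attrition fabricates an "effect"
# even though the true effect is zero. All numbers are made up.
control = list(range(1, 11))     # outcomes 1..10; everyone is tracked
treatment = list(range(1, 11))   # identical outcomes: the program did nothing

# The two worst-off treated people are lost to follow-up,
# while the control group is fully observed.
treatment_observed = treatment[2:]

naive_effect = (sum(treatment_observed) / len(treatment_observed)
                - sum(control) / len(control))
print(naive_effect)  # 1.0, created entirely by who dropped out
```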

2) The standard deviation of key variables

Without knowing how much variation there is in an outcome variable, it’s hard to know whether a paper found a relatively large or relatively small effect. Why? Studies often report outcomes on scales particular to the paper, for example, scores on a certain academic test. There is no way to compare these results across papers that use different tests unless you standardize the data; then you can at least say that program A was found to affect test scores by 0.1 standard deviations, while program B had an effect of 0.2 standard deviations.
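Concretely, standardization just divides the raw difference in means by a standard deviation, which is exactly why a paper needs to report that standard deviation. A minimal sketch with invented numbers:

```python
# Standardized effect size: the raw mean difference expressed in units of the
# control-group standard deviation. All numbers below are hypothetical.

def standardized_effect(mean_treat, mean_control, sd_control):
    return (mean_treat - mean_control) / sd_control

# Program A: test scored 0-100, control-group SD of 15 points
print(standardized_effect(62.0, 60.5, 15.0))  # 0.10 SD

# Program B: test scored 0-20, control-group SD of 2.5 points
print(standardized_effect(11.0, 10.5, 2.5))   # 0.20 SD
```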

3) Whether the results include people who did not take advantage of the program

Intent-to-treat (ITT) estimates consider an intervention’s effect on everyone assigned to receive the treatment, regardless of whether they actually took advantage of the program. The alternative is to estimate the effect of the treatment on the treated (TOT). For example, suppose that only 10% of people who were offered a bed net used it, and suppose bed nets were 90% effective at preventing malaria among users. The TOT estimate would be 90%; the ITT estimate, 9%. Clearly, if the authors don’t take care to explain which they are reporting, we really don’t know how to interpret the results!
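Under the simplifying assumptions of the bed-net example above (one-sided non-compliance and no effect on people who never use the net), the two estimates are linked by the take-up rate. Here is an illustrative sketch with the same made-up numbers:

```python
# Stylized link between ITT and TOT in the bed-net example.
# Assumes one-sided non-compliance and no effect on non-users.
take_up = 0.10       # share of the offered group who actually used the net
tot = 0.90           # effect among users: 90% reduction in malaria risk

itt = take_up * tot  # effect averaged over everyone offered a net
print(f"ITT = {itt:.0%}, TOT = {tot:.0%}")     # ITT = 9%, TOT = 90%

# Going the other way, a Wald/IV-style rescaling recovers TOT from ITT:
print(f"TOT recovered = {itt / take_up:.0%}")  # 90%
```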

4) Characteristics of the context of the intervention

Are the people in your study rich or poor? It could affect how well they respond to a cash transfer. Does your intervention aim to decrease an infectious disease? It probably matters what the underlying infection rate is within the population, especially if people can catch it from each other. When did the intervention start and end relative to data collection? It is difficult to know what results mean without knowing these basics about the people and setting in the study, and their absence makes comparing results across different settings even harder.

5) Comparable outcome variables

Finally, papers seem to “run away from each other” in the outcome variables they cover. If one paper addresses the effect of HIV/AIDS education on the incidence of the disease, another will focus on whether people got tested. This makes sense given researchers’ incentives to be the first to show a particular result and to differentiate their findings. However, a single paper cannot tell us how general its result is. For that, you need more studies, and to compare those studies, they need outcome variables that are as comparable as possible.

Better reporting is not an impossible problem to solve. The Experiments in Governance and Politics network (EGAP), for example, decided to fund projects clustered around comparable intervention and outcome measures. In psychology, it was journals that started demanding better reporting. Something similar should happen in economics to give researchers the right incentives to maximize the usefulness of their studies.

UPDATE: Monday March 17, 2014 5:08pm World Bank responds (see end of this post)

WARNING: the contents of this message are for private entertainment purposes only. Any unauthorized duplication of this message to score cheap points is strictly prohibited.

Email from World Bank, January 27:

I am writing to you in reference to a recent publication: “The Tyranny of Experts: Economists, Dictators, and the Forgotten Rights of the Poor” by William Easterly.
As part of our high priority events, we’d like to invite the author for a book signing event…  

The events program has hosted internationally renowned speakers including:  Amartya Sen, Angus Deaton…Christy Turlington … as well as numerous Heads of States and Nobel Laureates. 

Email from World Bank, February 5:

I am happy to confirm the event on March 18 from 12-2pm.

Could you please also send me a copy of the book, so we can provide it to a potential moderator.

Email from World Bank, February 6:

We are delighted and look forward to a great and exciting event on March 18. The event will be inside the main Preston auditorium (1818 H Street NW). 

Would it also be possible to send me a galley of the book? 

Email from World Bank, February 13:

Thank you very much for arranging the World Bank book event with Professor Easterly on “The Tyranny of Experts” for March 18, we very much appreciate it. We would like to convey our sincerest apologies though as we have inadvertently overbooked ourselves and have overlapping events that day. Given the large number of high-profile events our very small team is handling, we overlooked and provided you with this date prematurely. We will shortly come back to you with new dates so we may find a mutually suitable one.

February 27: In response to an inquiry about rescheduling, the World Bank emails back that they hope to work together again at some point in the future.

March 17, World Bank response: Asked to comment on this post last Friday, David Theis, Chief of Media Relations at the World Bank, responded with this statement at 5pm on Monday, March 17 (a snow day in DC):

“I have confirmed that we indeed had a double booking, so apologies for the scheduling mix-up. We would be more than happy to have you at the Bank and will be in touch to find a date. Sorry for the inconvenience.”


“Evidence-based policies” are in vogue. But how do you synthesize the evidence base? People often engage in “vote counting”: reading the literature and consciously or subconsciously summing up the number of findings for a positive effect, a negative effect, or no effect for a particular program. The group with the greatest number wins.

Unfortunately, vote counting is not an ideal way to synthesize the evidence. The biggest problem is that some “no effect” papers were unlikely to find an effect even if there was one. Many studies in development use too small a sample to be likely to detect an effect, so the fact that their results are insignificant is not actually all that informative.
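To put rough numbers on this (a sketch with assumed values, not figures from the database): for a true effect of 0.1 standard deviations, a standard power calculation shows how little an insignificant result from a small study tells us.

```python
# Rough power calculation for a two-arm trial detecting a true effect of
# 0.1 SD at the 5% level. Numbers are illustrative, not from AidGrade data.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_arm in (100, 500, 2000):
    power = analysis.power(effect_size=0.1, nobs1=n_per_arm, alpha=0.05, ratio=1.0)
    print(f"n = {n_per_arm:4d} per arm -> power ~ {power:.2f}")

# With 100 people per arm, power is only about 0.11: the study would miss a
# real 0.1 SD effect almost 90% of the time, so "no effect" is uninformative.
```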

An alternative technique, meta-analysis, can aggregate many insignificant findings and sometimes transform them into a jointly significant result. It also allows studies to be weighted differently, since not all studies are equal.
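As a minimal illustration of how that aggregation works, here is a fixed-effect (inverse-variance) pooling sketch. This is not AidGrade’s actual procedure, and the three estimates below are invented, each insignificant on its own:

```python
# Fixed-effect (inverse-variance) meta-analysis of three hypothetical studies,
# each individually insignificant at the 5% level.
import math

estimates  = [0.10, 0.12, 0.08]   # standardized effect sizes
std_errors = [0.07, 0.08, 0.06]   # their standard errors (each |z| < 1.96)

weights = [1 / se**2 for se in std_errors]   # precision weights
pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}, "
      f"z = {pooled / pooled_se:.2f}")
# The pooled z is about 2.4, above the 1.96 cutoff,
# even though no single study clears it.
```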

In most of the cases in which vote counting and meta-analysis diverge, vote counting reports an insignificant result and meta-analysis reports a significant positive one. For example, both conditional and unconditional cash transfer programs often had several “no effect” results (“cash transfers don’t work!”). These programs affect a very broad range of outcomes, but because some or all of those outcomes are only tangentially related to the intervention, it is harder to see an effect in any one study. Aggregate the insignificant results on labour force participation, grade promotion or test scores through meta-analysis, however, and they become significant (“cash transfers work!”).

The error of overstating the strength of “no effect” results through vote counting is all the worse given that “no effect” does not really mean no effect. The common misconception is that failing to reject the null hypothesis of no effect means we have accepted that null hypothesis, but that is simply untrue. Absence of evidence gets treated as evidence of absence, which is not what the test says. Perhaps with a bit more data the result would become significant.

How big is this problem? Preliminary analysis of a database of development studies I have assembled through a group called AidGrade suggests that the meta-analysis result for a particular intervention-outcome combination diverges from the result that would have been obtained by vote counting about a third of the time. Vote counting actually gives results very similar to what one would get by just looking at a single paper selected at random from the entire literature; that is not a great foundation on which to base policy recommendations. If we want to use rigorous evidence, we have to be rigorous about how we use rigorous evidence.


Last Monday we had the pleasure of hosting a few of our closest friends at Cooper Union’s Great Hall to celebrate the launch of Professor Easterly’s new book, The Tyranny of Experts: Economists, Dictators, and the Forgotten Rights of the Poor. Paul Romer gave a gracious introduction, and many audience members had the chance to question Bill’s audacious theories in a Q&A at the end of the lecture. Below are just a few selected clips from the evening (Paul’s introduction, Bill on his membership in Authoritarians Anonymous, and his answer to the perennial favorite question: “But What Can I Do?”). To hear more, take a look at the author’s speaking schedule for the next few months, which will take him to Boston, DC, the West Coast and London, and, of course, read the book.


Tyranny of Experts Book Launch from NYU Devt Research Institute on Vimeo.

Photo courtesy of Jessica Kane. See more photographs from the launch here.

NPR’s The Takeaway asks in an interview with one of our local troublemakers this week: are billionaire philanthropists the true champions in the fight against poverty? Listen to at least part of the audio to get the tone of the critique, and read the selected transcript excerpts below.

Bill Easterly: I have nothing to take away from the billionaires who are very generous, who are spending on the poor rather than on private jets – that’s great. But what can actually happen is they can also have too much influence on the way we see the whole problem of global poverty.


Gates has this ‘great man’ approach to development in which he sees great national leaders and great philanthropists like himself doing all the good things that happen. Unfortunately, the Ethiopian government…that he praised a year ago in his Annual Letter from his Foundation is not doing great things. He is very naïve to think that the government is benevolent and is actually contributing to development. They actually are serial human rights abusers that are destroying development.


Before Gates’ annual letter… there was a peaceful blogger named Eskinder Nega who was sentenced to eighteen years in prison simply for advocating more democracy in Ethiopia, for writing about the Arab Democratic Spring.

This kind of democratic activism is what you need to make government leaders benevolent. If you think of our own Chris Christie scandal on the bridge – that’s the sign of democracy working, that we keep Chris Christie from doing something bad. He’ll never do it again. No other governor will ever do it again.

John Hockenberry (host): Can’t you make an argument that you want to be separate from politics?  The United Nations and many NGOs try to stay out. For instance, CARE and the Red Cross are completely independent from politics. [They] go into Ethiopia regardless of what the government is doing and get access because of their objectivity, or their detachment from politics.

Bill Easterly: That’s the perpetual temptation in poverty reduction: to think you can do something that’s technically pure, that’s free from politics. Unfortunately that’s a delusion. Let me give you one example of that. Famine relief, you might think, is as apolitical as it can get. But unfortunately, to go back to Ethiopia, the same government Gates was praising was caught red-handed using famine relief to only give it to the supporters of the ruling party. They denied it to the opposition party members. They were starving the opposition; in the middle of a famine they were rewarding their own supporters and staying in power by that means.

Listen to the full program at The Takeaway here.