What makes a good implementing partner?
Unlike most academic economic research, running randomized controlled trials (RCTs) often involves intense collaboration between researchers and the organization or individuals implementing the intervention being evaluated. This collaboration can be the best thing about working on a study--or the worst. What should a researcher look for in an implementer? In a later post I will discuss what a researcher can do to strengthen the relationship and provide value to the implementing organization.
i) Sufficient scale
A first, and easy, filter for a good implementing partner is whether an organization is working at a big enough scale to generate a sample size that will provide enough power for the experiment. What counts as sufficient depends on the level at which the randomization will take place, the number of different variants of the program to be compared, and the outcome of interest. Thus a lot of detailed discussion takes place about what a potential evaluation would look like before it is possible to say whether it is feasible. However, it is surprising how many potential partnerships can be ruled out quite early on because the implementer is simply not working at a big enough scale to make a decent evaluation possible.
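To make the scale point concrete, here is a minimal back-of-the-envelope power sketch (not from the post itself; the effect size, cluster size, and intra-cluster correlation are illustrative assumptions) showing how the number of clusters needed per arm grows once randomization happens at the community rather than the individual level.

```python
# Illustrative power calculation for a two-arm, cluster-randomized comparison.
# All numbers below are assumptions chosen for the example, not from the post.
from scipy.stats import norm

def clusters_needed(mde_sd, cluster_size, icc, power=0.80, alpha=0.05):
    """Clusters per arm to detect an effect of `mde_sd` standard deviations
    when randomizing at the cluster (e.g. village or school) level."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    design_effect = 1 + (cluster_size - 1) * icc      # inflation from intra-cluster correlation
    individuals_per_arm = 2 * (z / mde_sd) ** 2 * design_effect
    return individuals_per_arm / cluster_size

# Example: a 0.2 SD effect, 20 households surveyed per village, ICC of 0.05
print(round(clusters_needed(mde_sd=0.2, cluster_size=20, icc=0.05)))   # ~38 villages per arm
```

With more treatment arms or smaller expected effects, the required number of clusters per comparison rises further, which is why many otherwise promising partners turn out to be operating at too small a scale.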
ii) Flexibility
A willingness to try different versions of the program and adapt elements in response to discussions with researchers makes an attractive implementing partner. As discussed above, we can learn a lot by testing different parts of a program together and separately or by comparing different approaches to the same problem against each other. The best partnerships are where researcher and implementer work together to find the most interesting versions of the program to test.
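As a hypothetical illustration of what testing components together and separately can look like, the sketch below assigns made-up communities to a 2x2 factorial design; the component names, sample size, and seed are invented for the example.

```python
# Hypothetical 2x2 factorial assignment: two program components randomized jointly,
# so each can be evaluated on its own and in combination. All names are illustrative.
import random

random.seed(2024)  # arbitrary seed for a reproducible draw

communities = [f"community_{i:03d}" for i in range(1, 201)]  # made-up study communities
random.shuffle(communities)

# Four equal cells: neither component, component A only, component B only, both.
cells = [("no training", "no grant"),
         ("training", "no grant"),
         ("no training", "grant"),
         ("training", "grant")]
assignment = {c: cells[i % 4] for i, c in enumerate(communities)}
```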
iii) Technical programmatic expertise, yet representative
There is a risk of testing a program run by an inexperienced implementer, finding a null result, and generating the response, “Of course there was no impact, you worked with an inexperienced implementer.” The researcher also has less to learn from an inexperienced implementer, and the partnership risks becoming one-sided. At the other end of the spectrum, we may not want to work with a "gold-plated" implementer unless we are doing a “proof-of-concept” evaluation of the type discussed above. There are two risks here: that the program is so expensive it will never be cost-effective even if it is effective, and that it relies on unusual, hard-to-reproduce noncash resources, such as a few highly dynamic mentors who would be difficult to replace. An implementer working at a very big scale is unlikely to run a gold-plated program and has already shown that the program can be scaled. It is also possible to work with a smaller implementer that closely follows a model used by others.
iv) Local expertise and reputation
Implementers who have been working with a population for many years have in-depth knowledge of local formal and informal institutions, population characteristics, and geography that is invaluable in designing and implementing an evaluation. What messages are likely to resonate with this population? What does success look like and how can we measure it? When I started working in Sierra Leone I spent a long time traveling round the country with staff from Statistics Sierra Leone, CARE, and the Institutional Reform and Capacity Building Project. I learned that it was socially acceptable to ask about the bloody civil war that had just ended but that asking about marital disputes could get us thrown out of the village. From Tajan Rogers I learned that every rural (and some urban) community in Sierra Leone comes together for “road brushing,” where residents clear encroaching vegetation from the dirt road that links their community to the next and even build the common palm-log bridges over rivers. How often this activity took place and what proportion of the community took part became our preferred measure of collective action and has been used in many papers since.
Just as importantly, an implementer who has been working locally has a reputation in local communities that it would take a researcher years to build. This reputation can be vital. We learn little about the impact of a program if suspicion around the implementer means that few take up the program.
Researchers need to understand how valuable this reputational capital is to the implementer. What may seem like reluctance to try new ideas may in fact be fully justified caution about putting a hard-won reputation on the line.
v) Low staff turnover
There are many difficulties in working with governments and donor organizations, but perhaps the hardest to overcome is high staff turnover. As we have emphasized, evaluation is a partnership of trust and understanding, and this takes time to build. All too often a key government or donor counterpart will move on just as an evaluation is reaching a critical stage. Their successor may be less open to evaluation, want to test a different question, oppose randomization, or simply be uninterested. The only way a researcher can protect the evaluation is to try to build relationships at many levels throughout the implementing organization so that the loss of one champion does not doom the entire project. But this may not be sufficient. One of the many advantages of working with local NGOs is that they tend to have greater stability in their staffing.
vi) Desire to know the truth and willingness to invest in uncovering it
The most important quality of an implementing partner is the desire to know the true impact of an intervention and a willingness to devote time and energy to helping the researcher uncover the truth. Many organizations start off enthusiastic about the idea of an evaluation but at some point realize that a rigorous evaluation may conclude their program does not have a positive impact. At this point, two reactions are possible: a sudden realization of all the practical constraints that will make an evaluation impossible, or a renewed commitment to learn.
In Running Randomized Evaluations (Glennerster and Takavarasha 2013), we quote Rukmini Banerji of Pratham at the launch of an evaluation of Pratham's flagship “Read India” program:
"[The researchers] may find that it doesn't work. But if it does not work, we need to know that. We owe it to ourselves and the communities we work with not to waste their and our time and resources on a program that does not help children learn. If we find that this program isn't working, we will go and develop something that will."
It is not just that an unwilling partner can throw obstacles in the path of an effective evaluation. An implementing partner needs to be an active and committed member of the evaluation team. There will inevitably be problems that come up during the evaluation process which the implementer will have to help solve, often at a financial or time cost to themselves. The baseline may run behind schedule and implementation will need to be delayed till it is complete; transport costs of the program will be higher as implementation communities will be further apart than they otherwise would be to allow for comparison groups; roll-out plans must be set further in advance than normal to allow for the evaluation; selection criteria must be written down and followed scrupulously, reducing discretion of local staff; and some promising program areas must be left for the comparison group. Partners will only put up with these problems and actively help solve them if they fully appreciate the benefits of the evaluation being high quality and they understand why these restrictions are necessary.
This commitment to the evaluation needs to be at many levels of the organization. If the headquarters in Delhi want to do an impact evaluation but the local staff don’t, it is not advisable for HQ to force the evaluation through because it is the staff at the local level who will need to be deeply involved in working through the details with the researcher. Similarly, if the local staff are committed but the HQ is not, there will be no support for the extra time and cost the implementer will need to participate in the study. Worst of all is when a funder forces an unwilling implementer to do an RCT run by a researcher. Being involved in a scenario of this kind will suck up months of a researcher's time trying to come up with evaluation designs that the implementer will in turn find some way to object to.
If this level of commitment to discovering the unvarnished truth sounds a little optimistic, there are practical ways to make an impact evaluation less threatening to a partner. An implementer who does many types of programs has less at stake from an impact evaluation of one of their programs than an organization that has a single signature program. Another option is to test different variants of a program rather than the impact of the program itself. For example, testing the pros and cons of weekly versus monthly repayment of microcredit loans (Field and Pande 2008) is less threatening than testing the impact of microcredit loans. In some cases researchers have started relationships with implementers by testing a question that is less threatening (although potentially less interesting). As the partnership has built up trust, the implementing partner has opened up more and more of their portfolio to rigorous testing.