Data Science and Design Thinking for Better Development
Methodology

A synopsis by Walid Madhoun, updated with reflections on the rise of AI in development practice. Based on the original article by Walid Madhoun and Johannes Wheeldon (2022).

The article argues that development organisations must move beyond counting outputs and use data science, embedded in a Design Thinking approach, to understand how projects improve people's lives. This synopsis updates the article with insights on the emergence of AI several years after its initial publication.

Read the full article on Academia
Data Science and Beneficiary-Centered Design for Better Development Outcomes

The paper, Data Science and Design Thinking for Better Development, proposes a practical way for development organisations to move beyond counting outputs and start using data science to understand how their work changes people's lives. It argues that to do this credibly, institutions need to reframe their results frameworks, embed Monitoring and Evaluation (M&E) within a Design Thinking-style institutional cycle, and build the data architecture and culture that allow projects to generate and use information.

International financial institutions and bilateral agencies already use familiar tools: results frameworks, safeguard systems, and various performance indicators. Over time, these tools have improved, and new technologies such as data science have emerged, promising cheaper data collection, faster analysis, and the ability to explain what happened, why, for whom, and who was left out. Yet, despite scattered innovations, data science is not applied systematically across organisations or portfolios. This gap is now exacerbated by the inefficient use of AI as the shiny new tool in the development toolbox.

The paper identifies a structural reason for this: current institutional practices emphasise easily measurable outputs and simple quantitative outcomes, while often postponing any serious measurement of "effect" or "impact" to occasional evaluations or research projects. In this environment, data science tools are underused, and performance measurement risks becoming a compliance exercise rather than a way to learn and improve.

Why data science is not a silver bullet — on its own

The paper defines data science broadly as covering data collection, extraction, transformation, management, integration, analysis, interpretation, and reporting; AI must now be counted among the mechanisms that perform many of these tasks. In this sense, data science includes big data, data warehouses, and applications such as machine learning and artificial intelligence. These tools clearly have potential: for example, satellite imagery combined with machine-learning algorithms can help map poverty or monitor infrastructure use more rapidly and at lower cost than traditional surveys.

However, development operates in a non-binary space shaped by culture, politics, and social norms. Data science tools were not designed for this complexity and struggle to capture nuance or socio-cultural drivers. Machine learning can misidentify individuals and communities due to biases in training data, as starkly shown in facial recognition systems. Models also require large datasets, yet innovative or niche development projects often lack enough historical data to train robust algorithms. The widespread use of AI today compounds the problem, because AI tends to miss the nuance, human judgement, and exceptions that often provide valuable insights.

Citing warnings against a "silver bullet fallacy," the paper argues that data science and AI should not be treated as a universal solution but used judiciously, with clearly defined goals and an honest understanding of their limitations. The question is not whether to use data science, but how to integrate it into the institutional fabric so that it supports better design, implementation, and learning.

Seeing the institutional cycle through Design Thinking

To create that enabling environment, the paper draws an explicit parallel between the development Institutional Cycle and the Design Thinking cycle. It distinguishes between the project cycle (the life of a single operation) and the broader Institutional Cycle, which covers country strategies, policy dialogue, project preparation and implementation, ex-post evaluation, and lessons feeding back into strategy.

Design Thinking, as practised in fields like product design and user experience, revolves around understanding users, defining their needs, ideating solutions, prototyping, testing, and implementing. Its history spans work on "wicked problems," rapid prototyping, and human-centered design, popularised by institutions like Stanford's design school and firms like IDEO. It seeks a balance between desirability (what users want), feasibility (what is technically and legally possible), and viability (what is economically sustainable).

The paper shows that if we relabel the stages of the Institutional Cycle in Design Thinking terms, the similarities are striking. Country strategies correspond to understanding and empathising with the user (citizens and governments); sector and thematic diagnostics help define needs; internal and external consultations generate project ideas; feasibility studies and pilots act as prototypes; preparation missions test assumptions with stakeholders; and implementation becomes an iterative process of testing, learning, adjusting, and capturing lessons. This mapping is more than cosmetic. Design Thinking places M&E along the entire cycle, not just at the end, making data collection and analysis a continuous feature of design, implementation, and adaptation. By adopting this mindset, development organisations can create multiple entry points where data science can add value: from initial problem framing and user research, through real-time performance monitoring, to ex-post learning across portfolios.

Figure 1: Combined Design Thinking project life cycle (Research, Define, Ideate, Prototype, Test, Implement), shown as concentric arcs around a project timeline. In actual application, the cycle wraps around the duration of a real engagement, with stages overlapping rather than running strictly in sequence.

Reframing results: from outputs to effects

The heart of the paper is a critique of typical project results frameworks and a proposal for reframing them in a way that both supports data science and generates more meaningful information about project effects. Using a generic multilateral development bank (MDB) project, the author shows how current frameworks often elevate outputs to intermediate outcomes and intermediate engagement metrics to "highest-level results." A common pattern is to track the number of people trained, reports produced, or communication events held, while rarely monitoring whether capacities changed, behaviours shifted, or people's conditions improved.

To illustrate the problem, the paper contrasts a "typical" and an "ideal" results framework for a teacher training project. In the ideal version, outputs include "100 teachers trained," the intermediate outcome measures whether teachers use new skills two years later, and the highest-level result captures the effect on students' performance, measured through appropriate methods such as 360-degree feedback. In the typical version, "100 teachers trained" is misclassified as an intermediate result and "75% of teachers using new skills after two years" becomes the highest-level result, effectively erasing the genuine effect level. This pattern, the paper argues, reflects a deeper tendency to treat impact measurement as the domain of sporadic evaluations, and to accept that projects will be judged mainly on disbursement, procurement progress, and easily countable outputs. The result is that organisations can declare success without understanding whether their investments produced meaningful change for the intended population.
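The contrast between the two frameworks can be sketched in a few lines of Python. This is an illustration only, not a tool from the paper: the names `Level`, `Indicator`, and `audit` are hypothetical, and the three-level scale is a simplification of the framework described above. Each indicator records what it actually measures and where the framework files it, so that misclassification and a missing effect level become visible.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Simplified three-level scale, assumed for this sketch only.
    OUTPUT = 1                # what the project delivers
    INTERMEDIATE_OUTCOME = 2  # whether beneficiaries behave differently
    EFFECT = 3                # whether people's conditions improved


@dataclass(frozen=True)
class Indicator:
    description: str
    true_level: Level      # what the indicator actually measures
    reported_level: Level  # where the results framework files it


def audit(framework):
    """Return (indicators filed too high, levels no indicator really measures)."""
    inflated = [i for i in framework if i.reported_level > i.true_level]
    measured = {i.true_level for i in framework}
    missing = [lvl for lvl in Level if lvl not in measured]
    return inflated, missing


# The "typical" framework from the teacher training example.
typical = [
    Indicator("100 teachers trained", Level.OUTPUT, Level.INTERMEDIATE_OUTCOME),
    Indicator("75% of teachers using new skills after two years",
              Level.INTERMEDIATE_OUTCOME, Level.EFFECT),
]

# The "ideal" framework: each indicator sits at the level it measures.
ideal = [
    Indicator("100 teachers trained", Level.OUTPUT, Level.OUTPUT),
    Indicator("Teachers using new skills after two years",
              Level.INTERMEDIATE_OUTCOME, Level.INTERMEDIATE_OUTCOME),
    Indicator("Change in students' performance", Level.EFFECT, Level.EFFECT),
]

for name, framework in (("typical", typical), ("ideal", ideal)):
    inflated, missing = audit(framework)
    print(name, "| filed too high:", len(inflated),
          "| unmeasured levels:", [m.name for m in missing])
```

In this sketch the typical framework has both indicators filed one level too high and nothing that measures the effect level, while the ideal framework passes both checks.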

"The main barrier is not technology but culture. M&E is still widely perceived as a compliance requirement, rather than as a strategic function for learning and improving performance."

Learning from the Canadian results model

To remedy this, the paper draws on the Government of Canada's results framework, which clearly distinguishes between outputs, immediate outcomes (changes in capacity and capability), intermediate outcomes (changes in behaviour), and ultimate outcomes (changes in state or well-being). Immediate outcomes cover shifts in knowledge, skills, attitudes, and willingness; intermediate outcomes capture how people behave differently by the end of a project; ultimate outcomes sit higher up, closer to long-term development goals such as poverty reduction or improved health.

By introducing a formal intermediate-outcome level into MDB project log frames, the author argues, teams are forced to think more clearly about what constitutes a reasonable, perceptible change during the life of a project and how to measure it. This in turn creates rich opportunities for data science: mobile phone records, geospatial data, user satisfaction surveys, and administrative records can all be harnessed to track intermediate behavioural changes in near real time. A light-rail project design and monitoring framework (DMF) is used to show how this works in practice.

From project data to institutional performance

Once results frameworks are reframed along these lines, data science can be integrated systematically at every level. At the output level, operational data and simple analytics are sufficient. At the immediate and intermediate outcome levels, organisations can combine performance records, cross-cutting data on inclusion or accessibility, and big data sources to build a more nuanced picture of behavioural change. At the impact and institutional level, more advanced tools, including AI and natural language processing, can help synthesise diverse data streams and estimate contributions to overarching objectives like poverty reduction.

The hardest challenge is attribution or, more precisely, contribution. When multiple projects in different sectors (urban transport, energy, TVET, SME support) contribute to outcomes like employment, how much credit should each receive? Simple weighting schemes based on project size, scope, or reach quickly become unmanageable and risk being arbitrary. The paper suggests that a more promising route is to use AI on large, multi-donor historical datasets to estimate the typical contribution of different project types to impacts such as employment and, in turn, to institutional goals such as poverty reduction. In other words, AI can learn from historical data how much projects of a given size, scope, and geographic distribution, among other factors, have tended to contribute. It can also help project designers predict, to some extent, the performance of future projects.
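A small worked example shows why simple weighting schemes are hard to defend. The sketch below is not from the paper: the portfolio, the budget and reach figures, and the function `contribution_shares` are all invented for illustration. It splits credit for a shared outcome such as employment across four projects, first in proportion to budget and then in proportion to people reached.

```python
# Hypothetical portfolio: every figure below is invented for illustration.
projects = {
    "urban transport": {"budget_musd": 300, "people_reached": 40_000},
    "energy":          {"budget_musd": 150, "people_reached": 90_000},
    "TVET":            {"budget_musd": 30,  "people_reached": 20_000},
    "SME support":     {"budget_musd": 20,  "people_reached": 50_000},
}


def contribution_shares(portfolio, basis):
    """Split credit for a shared outcome in proportion to one attribute."""
    total = sum(p[basis] for p in portfolio.values())
    return {name: p[basis] / total for name, p in portfolio.items()}


by_budget = contribution_shares(projects, "budget_musd")
by_reach = contribution_shares(projects, "people_reached")

for name in projects:
    print(f"{name:16s} budget share {by_budget[name]:.0%}   "
          f"reach share {by_reach[name]:.0%}")
```

With these made-up numbers, the urban transport project receives 60% of the credit when weighted by budget but only 20% when weighted by reach. Neither figure rests on evidence about what the project actually changed, which is the arbitrariness that estimating contributions from historical data is meant to reduce.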

Data architecture and cultural change

To support this, the author outlines the need for a robust data architecture centred on an extended or logical data warehouse that can ingest structured, semi-structured, and unstructured data. This warehouse would integrate project M&E data, surveys, administrative information, big data sources, and historical records from multiple MDBs and bilateral donors, subject to appropriate agreements. It would then feed analytics and data science processes whose outputs inform project design, mid-course corrections, and institutional performance reporting.
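As a rough illustration of what ingesting structured, semi-structured, and unstructured data can mean in practice, the sketch below wraps each incoming item in a common envelope so that later analytics can query across sources. It is a toy, in-memory stand-in rather than the architecture the paper proposes: the class names `Record` and `LogicalWarehouse`, the source labels, the project identifier, and the sample data are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Record:
    source: str      # hypothetical label, e.g. "project_mne" or "survey"
    kind: str        # "structured", "semi-structured" or "unstructured"
    project_id: str
    payload: Any     # dict for rows, parsed JSON, or raw text


@dataclass
class LogicalWarehouse:
    """In-memory stand-in for a warehouse that accepts all three data kinds."""
    records: list = field(default_factory=list)

    def ingest_row(self, source, project_id, row):
        # Structured data: a row with fixed, known fields.
        self.records.append(Record(source, "structured", project_id, dict(row)))

    def ingest_json(self, source, project_id, text):
        # Semi-structured data: parsed, but the shape may vary by source.
        self.records.append(
            Record(source, "semi-structured", project_id, json.loads(text)))

    def ingest_text(self, source, project_id, text):
        # Unstructured data: kept as raw text for later language processing.
        self.records.append(Record(source, "unstructured", project_id, text))

    def for_project(self, project_id):
        return [r for r in self.records if r.project_id == project_id]


# Invented sample data for a hypothetical project "P-001".
warehouse = LogicalWarehouse()
warehouse.ingest_row("project_mne", "P-001", {"indicator": "teachers trained", "value": 100})
warehouse.ingest_json("survey", "P-001", '{"respondent": 17, "answers": {"q1": "yes"}}')
warehouse.ingest_text("field_report", "P-001", "Attendance dropped during harvest season.")

for record in warehouse.for_project("P-001"):
    print(record.kind, "from", record.source)
```

The design choice worth noting is that every item keeps its source and project identifier regardless of format, which is what would allow project M&E data, surveys, and narrative reports to be analysed together.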

Crucially, the paper argues that the main barrier is not technology but culture. M&E is still widely perceived by borrowers as a compliance requirement, and by some staff as a box-ticking exercise linked to disbursement and procurement, rather than as a strategic function for learning and improving performance. Some officials within multilateral and bilateral agencies approach M&E from an academic perspective, insisting on scientific methods designed for the laboratory rather than adapted, quasi-scientific methods better suited to the human environment in which people live. Shifting this mindset requires allocating more resources to M&E in project budgets, embedding it from the earliest stages of the Design cycle, and convincing borrowers of the value of effect-focused M&E for their own systems and policies.

The proposal is evolutionary, not disruptive. It builds on existing tools, frameworks, and experiments, but reframes them so that projects are held accountable to a reasonable degree of effect within their lifetime, and institutions can credibly estimate how those projects contribute to ultimate development goals. Especially in the wake of the COVID-19 crisis, it is no longer acceptable to infer impact from outputs alone; development organisations must embrace Design Thinking together with data science and AI if they are to show, with evidence, how their work improves the lives of the citizens they serve.

Walid Madhoun
Partner, Strategy