
When Measurement Becomes the Intervention

  • Writer: Kaisa Vaittinen
  • Feb 18
  • 12 min read

What self-regulated learning theory, the theory of planned behavior, and self-efficacy research reveal about why measurement changes behavior, and how organizational evaluation can leverage this effect through scaffolding principles.


There is a well-documented finding in the behavioral sciences that is simultaneously widely known and chronically underutilized: simply asking people questions about their behavior changes their behavior.


The phenomenon is known by several names: mere-measurement effect (Morwitz & Fitzsimons, 2004), question-behavior effect (Sprott et al., 2006), and measurement reactivity (French & Sutton, 2010). Evidence for this phenomenon has accumulated across dozens of different contexts:


  • Students who were asked about their voting intentions were more likely to vote (Greenwald et al., 1987).

  • Consumers who were asked about their purchase intentions changed their purchasing behavior (Morwitz, Johnson & Schmittlein, 1993).

  • Respondents to a blood donation survey donated blood more frequently (Godin et al., 2008).


In Sherman's (1980) original experiments, the effect was dramatic, as large as 27–28 percentage points. Subsequent meta-analytic reviews have shown that the effect varies by context and is on average more modest, typically small to medium, but the direction is consistent: asking changes behavior (Sprott et al., 2006).


In traditional research methodology, the phenomenon is treated as a validity threat. The threat is minimized through measures such as the Solomon four-group design or careful protocol management. And if measurement is seen as something that aims at objective truth, this logic is sound: if the act of measurement itself changes the phenomenon being measured, observed changes cannot be purely attributed to the intervention. Measurement has contaminated the data.


But could the question-behavior effect be approached differently? What if this effect were put to deliberate use, making measurement an intentional part of the intervention designed to produce desired change? What if this so-called contamination is actually the primary mechanism through which organizational learning occurs?


Treating measurement as an integral part of the intervention itself is a deliberate design choice. It connects to intervention design and, at the level of principle, to the role measurement is assigned within a change initiative.


Why Asking Changes Behavior


Before moving to the organizational context, it is worth pausing to examine what actually happens behind the question-behavior effect. Two theories offer complementary explanatory models.


Ajzen's Theory of Planned Behavior (Ajzen, 1991) describes the structure of intention formation. According to Ajzen, behavioral intention, which is the strongest single predictor of behavior, is composed of three factors: attitude toward the behavior, subjective norm (what others expect), and perceived behavioral control (whether I believe I am capable). When a person is asked about their behavior or intentions, these three components are activated. The question makes the attitude more accessible, brings social expectations to mind, and forces an evaluation of one's own ability to act.
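To make the structure concrete, the three antecedents are often operationalized as a weighted composite predicting behavioral intention. The sketch below is my illustration, not Ajzen's formalism: the class name, field names, and the example weights are all assumptions chosen for clarity; in applied work the weights are estimated empirically (e.g., by regression) for each specific behavior.

```python
from dataclasses import dataclass

@dataclass
class TPBAssessment:
    """Illustrative container for the three TPB antecedents, each scored 0-1."""
    attitude: float            # attitude toward the behavior
    subjective_norm: float     # perceived social expectations
    perceived_control: float   # belief in one's own capability to act

def behavioral_intention(a: TPBAssessment,
                         w_att: float = 0.4,
                         w_norm: float = 0.3,
                         w_pbc: float = 0.3) -> float:
    """Weighted composite of the three antecedents.

    The weights here are placeholders; in practice they differ by
    behavior and population and are estimated from data.
    """
    return (w_att * a.attitude
            + w_norm * a.subjective_norm
            + w_pbc * a.perceived_control)
```

The point of the sketch is structural: a question that raises any one of the three components (making an attitude accessible, a norm salient, or control beliefs explicit) moves the composite, which is why mere asking can shift intention.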


Ajzen's perceived behavioral control is closely related to Bandura's (1997) concept of self-efficacy, which describes an individual's belief in their own ability to perform a specific task. Bandura demonstrated that self-efficacy is not a fixed trait but a dynamic belief shaped by four sources: mastery experiences, vicarious experiences through observing others' performance, social persuasion, and interpretation of physiological states. When measurement asks a person to evaluate their own actions, this evaluative prompt activates the first and perhaps most important of these sources, namely forcing them to process their own prior experiences in relation to the target behavior. This processing constitutes active modification of self-efficacy beliefs.


Morwitz and Fitzsimons (2004) have described this overall process as increased attitude accessibility: posing a question elevates the attitude toward the target behavior to cognitive foreground, which strengthens the intention-behavior link. From Bandura's perspective, the same process simultaneously forces an evaluation of one's own capability in relation to the behavior. In other words, merely asking makes behavior more likely because it activates both the psychological structures from which intention is formed (Ajzen) and the beliefs that determine whether a person feels capable of acting (Bandura).


This is an important theoretical foundation for what follows in applying measurement to the organizational context. We are dealing with a well-understood psychological mechanism in which asking activates attitudes and intentions (Ajzen) and simultaneously shapes beliefs about one's own capability (Bandura).


The Self-Regulation Connection


Barry Zimmerman's cyclical model of self-regulated learning (Zimmerman, 2000; Zimmerman & Moylan, 2009) brings a third theoretical layer to this discussion, one that ties the mechanisms described above together into a process model. Zimmerman's model is grounded specifically in Bandura's social cognitive theory, and self-efficacy serves as its central energizing force: in the forethought phase, self-efficacy beliefs influence what goals the learner sets and how committed they are to pursuing them. The model describes learning as a recursive cycle through three phases: forethought, performance, and self-reflection.


In the forethought phase, the learner analyzes the task, sets goals, and activates motivational beliefs. In the performance phase, they execute the task while monitoring their progress. In the self-reflection phase, they evaluate outcomes against the standards they set, make causal attributions, and produce inferences that feed back into the next cycle.

In this model, self-monitoring is an active cognitive process that changes the behavior being monitored. It produces the information that enables self-evaluation, which in turn produces the attributions that shape future forethought processes. Without self-monitoring, there is no data for self-reflection. Without self-reflection, there is no adaptation. When either link is missing, the process stalls.
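The recursive shape of the cycle can be sketched in a few lines. This is my minimal illustration of the loop structure, not Zimmerman's formalism: the function names and the idea of passing each phase the previous phase's output are assumptions made for the sketch.

```python
def run_cycle(set_goal, perform, reflect, n_cycles=3, prior_inference=None):
    """Illustrative sketch of a recursive self-regulation loop.

    set_goal: forethought      -- turns the previous inference into a goal
    perform:  performance      -- executes and self-monitors, yields an outcome
    reflect:  self-reflection  -- compares outcome to goal, yields an inference

    The inference produced by self-reflection feeds back into the next
    forethought phase, which is what makes the cycle recursive.
    """
    history = []
    for _ in range(n_cycles):
        goal = set_goal(prior_inference)           # forethought
        outcome = perform(goal)                    # performance + monitoring
        prior_inference = reflect(goal, outcome)   # self-reflection
        history.append(prior_inference)
    return history
```

Removing `perform`'s monitoring output or `reflect`'s inference breaks the loop, which is the code-level analogue of the point above: no data for self-reflection, no adaptation.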


When this is juxtaposed with Ajzen's and Bandura's models, the picture becomes complete: asking activates attitudes and intentions (Ajzen), shapes self-efficacy beliefs (Bandura), and the self-monitoring cycle keeps these processes active and directs behavior toward the goal (Zimmerman). The three theories describe the same phenomenon at different levels and complement one another.


Three Phases, One Measurement Process


When an organization decides to measure the effectiveness of a leadership development program or post-training behavioral change, ideally the measurement is conducted as a series of phases that follow Zimmerman's model.


Forethought: determining what to measure. Before any data is collected, someone must decide what matters. What does "effective leadership" look like in this organization? Which behaviors should change? This process, when conducted dialogically with stakeholders, activates precisely the mechanisms predicted by the theories described above: it forces participants to articulate the standards against which they will later evaluate themselves (Zimmerman), makes attitudes and norms related to the target behavior cognitively salient (Ajzen), and initiates the evaluation of self-efficacy beliefs in relation to the goals being set (Bandura).


A team that has collectively defined "psychological safety" as a measurable dimension has already begun paying attention to it differently.


Performance: measurement itself as self-monitoring. When participants respond to a survey about their own leadership behavior or reflect on how frequently they apply new skills, they are engaging in self-monitoring. They systematically observe and record features of their own behavior. Simultaneously, in line with Ajzen's model, their attitudes, norm perceptions, and perceived control are reactivated, and in line with Bandura's model, they are forced to confront their own self-efficacy in relation to the targeted change: am I capable of this, have I progressed, is the change realistic?


Self-reflection: reviewing the results. When measurement data is analyzed and results are presented to participants, they compare actual performance against the standards set in the forethought phase. They make attributions and produce inferences about what to do next.

This feedback loop is the mechanism through which measurement can generate and sustain organizational learning.


Two Modes of Measurement, One Cycle


At this point, we bring intervention and educational research, along with evidence-based practice, into the discussion. In intervention and educational contexts, two different modes of measurement operate within the same process. The first is measurement as a scaffold (closely related to, though not identical with, formative assessment), designed to promote learning and produce controlled behavioral change. This kind of measurement creates opportunity and guides change. The second is evaluative measurement (similarly close to, but not identical with, summative assessment), which seeks to capture change as objectively as possible. Both modes of measurement are necessary.


The distinction I make in this text between measurement as a scaffold and evaluative measurement is a theoretical synthesis, not established terminology in the literature. I think of it as a design principle, grounded in Zimmerman's, Bandura's, and Ajzen's theories and in scaffolding methods, that I apply to the organizational context.


Measurement as a scaffold is goal-oriented. It does not pretend to be neutral. Its purpose is to activate self-regulatory processes: to prompt participants to reflect, to bring target behavior into awareness, to support the transfer of new competencies into everyday practice. In a leadership development program, this might look like daily reflection prompts ("What did I do today to create space for others?"), peer check-ins during team meetings ("What have I concretely done for my colleagues this week?"), or structured real-time 360-degree feedback during facilitated sessions.


These are not intended to be neutral data collection methods. They are specifically designed as scaffolding tools whose purpose is to support the development of self-regulation and thereby learning.


The scaffolding concept, originating from Vygotsky's zone of proximal development and developed further by Wood, Bruner, and Ross (1976), describes support that is deliberately designed to be gradually withdrawn as the learner develops independent capability. In the measurement context, this means that measurement as a scaffold should be most intensive at the beginning of the change process, when participants need external impulses to activate self-monitoring, and should decrease as self-regulatory habits internalize. At a later stage of the change initiative, or after its completion, scaffolding elements are still worth using to keep the targeted change present in everyday practice over time.


This is directly in line with Zimmerman's multi-level development model, which describes progression from observation through emulation to self-control and ultimately self-regulation. At the earliest levels, learners are dependent on external models and structured feedback. At the highest level, they monitor themselves without external guidance.

Research on self-efficacy interventions illustrates this concretely. For example, in mathematics education, combining self-regulated learning strategies with scaffolding support has been shown to strengthen self-efficacy beliefs. Critically, this does not happen by telling students that they are capable, which would correspond to Bandura's weakest source, verbal persuasion, but by creating structured opportunities to observe and evaluate their own progress (Zimmerman, 2000), which corresponds to Bandura's strongest source, mastery experiences. Measuring self-efficacy beliefs during an intervention is itself a scaffolding device: it forces learners to articulate and confront their own capability beliefs, which is a prerequisite for changing them.


The question-behavior effect is at its most powerful when measurement operates as a scaffold.

A reflective question is designed to change behavior by increasing the cognitive accessibility of the target behavior (Morwitz & Fitzsimons, 2004). A professional who uses reflective in-process check-ins is not ruining the data. They are using a theory-based scaffolding tool that simultaneously produces useful process information.


It is important to state this boundary clearly: in experimental research, reactivity remains a genuine validity threat that is controlled through appropriate methods. Here, we are discussing applied organizational contexts where change is the goal, not an unwanted side effect.


Evaluative measurement aims to capture change with minimal reactive distortion. 

This type of measurement can be a pre-during-post psychometric instrument, behavioral observation, or a construct-validated survey administered at planned time points. Its purpose is not to support learning but to answer the question: did something actually change, and by how much?


In Zimmerman's model, this corresponds to the self-evaluation subprocess of the self-reflection phase, where actual performance is compared against the standards that were set. The quality of self-evaluation depends on the quality of the goals that were set and the accuracy of the data.


This is where psychometric rigor is most critical. Construct validity (Flake et al., 2017), appropriate reliability estimation (McDonald's omega based on polychoric correlations for ordinal data rather than Cronbach's alpha), careful item construction, and attention to response bias are prerequisites for evaluative measurement to function as neutrally as possible.
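The omega coefficient mentioned above has a compact closed form once factor loadings are available. The sketch below assumes a unidimensional scale with standardized loadings; for ordinal items those loadings would come from a factor analysis of the polychoric correlation matrix, which is the hard part and is not shown here. The function name is mine.

```python
def mcdonalds_omega(loadings):
    """McDonald's omega for a unidimensional congeneric scale.

    omega = (sum of loadings)^2 /
            ((sum of loadings)^2 + sum of error variances)

    With standardized loadings, each item's error variance is
    1 - loading^2. The loadings themselves must be estimated first
    (for ordinal data, from the polychoric correlation matrix).
    """
    s = sum(loadings)
    error_variance = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error_variance)
```

Unlike Cronbach's alpha, this formula does not assume equal loadings across items, which is one reason omega is preferred when loadings vary.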


It must also be acknowledged that measurement completely free of reactivity is practically impossible. Evaluative measurement, too, affects the phenomenon being measured to some degree. But the effect can be minimized through appropriate instrument design, and this minimization is what is required of evaluative measurement.


The Design Challenge: Both Modes Within the Cycle


The real question is not whether measurement changes behavior. It does. The important question is: how to design a measurement process that deliberately uses measurement as a scaffold to drive change while simultaneously preserving the reliability of evaluative measurement to capture that change?


In Zimmerman's framework, self-monitoring (scaffold) and self-evaluation (evaluative) are different subprocesses within the same regulatory cycle. They need each other. Self-monitoring without self-evaluation produces activity without intentional control. Self-evaluation without self-monitoring produces guesswork, estimates without genuine understanding.


In practice, this has several implications.


The content and framing of measurement as a scaffold and evaluative measurement should differ from each other. Measurement as a scaffold can employ open reflection prompts, peer check-ins embedded in team meetings, and behavior-anchored self-observations. Evaluative measurement should use items that are validated as rigorously as possible, minimizing leading effects and the influence of social desirability. Using the same instrument for both purposes weakens both.


The process in which what is measured is defined belongs to the forethought phase and is a scaffolding element by nature. When a team collectively defines the measurable dimensions and articulates what they look like in their environment, they have already changed their relationship to those constructs. This is not contamination of subsequent evaluative measurement. It is a necessary precondition for it: change cannot be evaluated against a goal that was never articulated.


The intensity of measurement as a scaffold should follow scaffolding logic: most intensive at the beginning of the change process, gradually decreasing as self-regulatory capability develops. Daily reflection prompts in the first week may become weekly by the second month and disappear entirely by the fourth month. As already noted, scaffolding-type measurement is also worth using after the change initiative has concluded, occasionally, so that the change remains embedded in everyday practice and does not, so to speak, wear off.
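The fading schedule described above can be sketched as a simple scheduling rule. This is a hypothetical illustration: the cutoffs follow the example in the text (daily in the first week, weekly by the second month, faded out around the fourth month, then occasional maintenance prompts), and the function name and exact intervals are my assumptions.

```python
def prompt_due(day: int) -> bool:
    """Illustrative scaffolding-fade schedule for reflection prompts.

    day: days elapsed since the start of the change process (1-based).
    Returns True when a prompt should fire on that day.
    """
    if day <= 7:                 # first week: daily prompts
        return True
    if day <= 60:                # through month two: weekly
        return day % 7 == 0
    if day <= 120:               # fading out by month four: monthly
        return day % 30 == 0
    return day % 90 == 0         # maintenance: occasional prompts thereafter
```

The design choice worth noting is the non-zero tail: rather than dropping to zero, the schedule keeps a sparse maintenance cadence so the targeted change stays present in everyday practice.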

The timing of evaluative measurement should follow the self-regulation cycle rather than a simple pre-post design. Evaluative measurement belongs at transition points between cycles, moments when it is meaningful to ask "where are we now relative to where we started?"


And transparency about which mode is being used is an absolute requirement. When reporting results, the distinction between "we used reflective measurement as scaffolding for behavioral change" and "we used validated items to evaluate whether change occurred" must be explicit. Mixing these, that is, presenting supportive self-monitoring data as if it were evaluative outcome data, undermines credibility.


What This Means for Measurement Professionals


The traditional framing positions measurement and intervention as sequential and separable: first the intervention, then the measurement. Zimmerman's theory of self-regulated learning and the scaffolding tradition that extends it, however, conceive of measurement and intervention as integrated and recursive, though still distinguishable. Measurement as a scaffold and evaluative measurement are different subprocesses within the same cycle, and each has its own objectives.


Designing a measurement framework is not a neutral act: the phenomena you name, the dimensions you define dialogically with stakeholders, and the reflection prompts you embed in the process all shape the organization's attention and, through attention, the organization's behavior. This is the scaffolding layer, and it is valuable precisely because it is both purposeful and part of the change itself. It should be designed as supportive: intensive at the start, gradually withdrawn as self-regulatory capability grows.


But the evaluative layer must remain as neutral and reliable as possible. A measurement framework with strong construct validity, whose results are analyzed appropriately using statistical tools, is valuable not merely because it produces more precise numbers, but because it creates an authentic mirror against which self-evaluation can take place. If the mirror flatters, the self-regulation cycle breaks down. If it is honest, it produces actionable inferences for the next improvement cycle.


Measurement Intelligence treats measurement as part of the intervention and leverages the mere-measurement effect deliberately and unapologetically, without neglecting evaluative measurement.

References


Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179–211. https://doi.org/10.1016/0749-5978(91)90020-T


Bandura, A. (1997). Self-efficacy: The exercise of control. W. H. Freeman.


Flake, J. K., Pek, J., & Hehman, E. (2017). Construct validation in social and personality research: Current practice and recommendations. Social Psychological and Personality Science, 8(4), 370–378. https://doi.org/10.1177/1948550617693063


French, D. P., & Sutton, S. (2010). Reactivity of measurement in health psychology: How much of a problem is it? What can be done about it? British Journal of Health Psychology, 15(3), 453–468. https://doi.org/10.1348/135910710X492341


Godin, G., Sheeran, P., Conner, M., & Germain, M. (2008). Asking questions changes behavior: Mere measurement effects on frequency of blood donation. Health Psychology, 27(2), 179–184. https://doi.org/10.1037/0278-6133.27.2.179


Greenwald, A. G., Carnot, C. G., Beach, R., & Young, B. (1987). Increasing voting behavior by asking people if they expect to vote. Journal of Applied Psychology, 72(2), 315–318. https://doi.org/10.1037/0021-9010.72.2.315


Morwitz, V. G., & Fitzsimons, G. J. (2004). The mere-measurement effect: Why does measuring intentions change actual behavior? Journal of Consumer Psychology, 14(1–2), 64–74. https://doi.org/10.1207/s15327663jcp1401&2_8


Morwitz, V. G., Johnson, E., & Schmittlein, D. (1993). Does measuring intent change behavior? Journal of Consumer Research, 20(1), 46–61. https://doi.org/10.1086/209332


Sherman, S. J. (1980). On the self-erasing nature of errors of prediction. Journal of Personality and Social Psychology, 39(2), 211–221. https://doi.org/10.1037/0022-3514.39.2.211


Sprott, D. E., Spangenberg, E. R., Block, L. G., Fitzsimons, G. J., Morwitz, V. G., & Williams, P. (2006). The question–behavior effect: What we know and where we go from here. Social Influence, 1(2), 128–137. https://doi.org/10.1080/15534510600685409


Wood, D., Bruner, J. S., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, 17(2), 89–100. https://doi.org/10.1111/j.1469-7610.1976.tb00381.x


Zimmerman, B. J. (2000). Attaining self-regulation: A social cognitive perspective. In M. Boekaerts, P. R. Pintrich & M. Zeidner (Eds.), Handbook of self-regulation (pp. 13–39). Academic Press.


Zimmerman, B. J., & Moylan, A. R. (2009). Self-regulation: Where metacognition and motivation intersect. In D. J. Hacker, J. Dunlosky & A. C. Graesser (Eds.), Handbook of metacognition in education (pp. 299–315). Routledge.
