
SMART + SANE: A Framework for Meaningful Measurement

  • Writer: Kaisa Vaittinen
  • Jan 9
  • 5 min read

The Problem with SMART Alone


SMART goals are everywhere. Specific, Measurable, Achievable, Relevant, Time-bound. The framework has shaped how organizations think about objectives for decades.

But SMART has a blind spot: it assumes that "measurable" is binary. Either you can measure something, or you cannot.


In practice, measurement quality exists on a spectrum. A goal can be technically measurable but measured poorly. The numbers exist, but they mislead rather than inform.

Consider a leadership development program. The SMART goal might be: "Increase leadership competency scores by 15% within 12 months." This is specific, measurable, achievable, relevant, and time-bound.


But what does "leadership competency" actually mean in this context? Is the instrument capturing what matters? Will a 15% increase reflect genuine change or just measurement noise?


SMART tells us nothing about this.


Introducing SANE: Criteria for Measurement Quality


If SMART asks "can we measure this?", SANE asks "are we measuring it well?"

SANE provides four criteria that complement SMART by focusing on the quality of measurement itself.


S – Specific to the Phenomenon


The measure must reflect the actual phenomenon, not a convenient proxy.

Measuring "training hours completed" is not the same as measuring "learning transfer." Measuring "survey responses" is not the same as measuring "engagement." Measuring "meetings held" is not the same as measuring "collaboration quality."

Specificity requires defining the phenomenon first, then building measures that fit. This sounds obvious, but most organizational measurement works the other way around: starting with available data or existing instruments, then retrofitting them to whatever needs measuring.


Generic instruments cannot be specific to phenomena they were not designed for. A standardized engagement survey developed in one context may miss what engagement actually means in yours.


This is why phenomenon-first measurement matters. Before asking "how do we measure this?", ask "what exactly are we trying to understand?"


A – Actionable in Interpretation


Results must guide decisions. A score of 3.8 out of 5 means nothing without context.

What would 4.2 mean? What would 3.2 mean? What actions follow from different results?

Actionable measures are designed with decision-making in mind. They distinguish between states that require different responses. If every possible result leads to the same action (or no action), the measurement serves no practical purpose.


This criterion challenges the common practice of measuring for measurement's sake. Before creating an instrument, ask: "What will we do differently based on what we learn?" If the answer is unclear, the measurement may not be worth doing.


N – Nuanced in Uncertainty


Good measurement acknowledges what it does not know.


Instead of reporting a single number, nuanced measurement communicates the range of plausible values and the confidence in conclusions. This is particularly important when sample sizes are small or when measuring complex psychological phenomena.

Modern statistical approaches help here. Rather than asking "is the effect statistically significant?", nuanced measurement asks "what is the probability that the effect exceeds a meaningful threshold?"


For example: "There is an 87% probability that reliability exceeds acceptable levels" is more useful than "p < 0.05." The first statement directly answers the question we care about. The second requires interpretation that is frequently done incorrectly.


Nuance also means acknowledging the limits of what measurement can tell us. A pre-post difference does not automatically mean the intervention caused the change. Triangulation—using multiple measurement approaches—increases confidence without requiring certainty.


E – Evidenced Through Validation


The measure must demonstrate that it captures what it claims to capture.

Validation is not a one-time event that happens before measurement begins. It is an ongoing process of gathering evidence that the instrument works as intended.


Three types of evidence matter:


Substantive evidence comes from the process of building the measure. Was the phenomenon clearly defined? Were items developed systematically? Did subject matter experts review the instrument?

Structural evidence comes from analyzing how the measure behaves. Do items that should correlate actually correlate? Does the factor structure match the intended dimensions? Is reliability adequate?

External evidence comes from comparing the measure to other sources. Do results align with what would be expected based on theory? Do triangulated data sources converge on similar conclusions?
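Structural evidence often starts with an internal-consistency check. Below is a minimal Cronbach's alpha computed from a respondents-by-items score matrix; the data is synthetic, generated so that four items share one underlying trait:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Synthetic example: four items driven by one underlying trait.
rng = np.random.default_rng(7)
trait = rng.normal(0, 1, size=200)
items = trait[:, None] + rng.normal(0, 0.5, size=(200, 4))
print(f"alpha = {cronbach_alpha(items):.2f}")
```

A single coefficient is not validation on its own; it is one piece of structural evidence alongside factor structure and item correlations.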


Validation does not require traditional experimental designs with large samples and control groups. Pragmatic validation through triangulation can provide sufficient evidence for applied contexts. When self-report data, observational data, and administrative data point in the same direction, confidence increases even without formal validation studies.


SMART + SANE in Practice


Combining SMART and SANE creates a more complete framework for goal-setting and measurement.

| Criterion | Question | Focus |
|---|---|---|
| Specific (SMART) | What exactly do we want to achieve? | Goal clarity |
| Measurable (SMART) | Can we quantify progress? | Data availability |
| Achievable (SMART) | Is this realistic? | Feasibility |
| Relevant (SMART) | Does this matter? | Strategic alignment |
| Time-bound (SMART) | When will we evaluate? | Timeline |
| Specific to phenomenon (SANE) | Does the measure fit what we're actually studying? | Construct validity |
| Actionable (SANE) | Will results guide decisions? | Practical utility |
| Nuanced (SANE) | Do we acknowledge uncertainty appropriately? | Epistemic honesty |
| Evidenced (SANE) | Do we have validation evidence? | Measurement quality |

SMART without SANE risks generating meaningless numbers. SANE without SMART risks measurement that serves no strategic purpose.


A Worked Example


An organization wants to improve psychological safety in teams.


SMART goal: "Increase psychological safety scores by 20% across all teams within 12 months."


This goal is specific, measurable, achievable (assuming baseline data suggests room for improvement), relevant (assuming psychological safety matters for team performance), and time-bound.


But is it SANE?


Specific to phenomenon: What does "psychological safety" mean in this organization? Is it about speaking up in meetings? Admitting mistakes? Challenging senior leaders? A generic psychological safety scale may not capture what matters here. The measure should be developed—or at least validated—in context.


Actionable: What happens at 10% improvement versus 25%? What happens if some teams improve while others decline? The goal should connect to specific interventions that can be adjusted based on results.


Nuanced: A 20% increase from baseline may or may not be meaningful depending on measurement error and sample size. Rather than a single target, consider: "Achieve at least 80% probability that psychological safety has improved by a practically meaningful amount."
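To see why measurement error matters for a fixed percentage target, a quick simulation helps: how often would noise alone produce an apparent 20% gain between two measurement points? Every parameter below (baseline score, spread, team size) is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

baseline, sd, n = 3.2, 0.8, 25   # hypothetical mean, spread, respondents
sims = 10_000

# Simulate pre and post team means in a world with NO real change.
pre = rng.normal(baseline, sd, size=(sims, n)).mean(axis=1)
post = rng.normal(baseline, sd, size=(sims, n)).mean(axis=1)

false_hits = ((post - pre) / pre >= 0.20).mean()
print(f"Apparent >=20% gains from noise alone: {false_hits:.1%}")
```

Running this kind of check with your own sample sizes shows whether the target sits comfortably above the noise floor or within it.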


Evidenced: Before large-scale measurement, validate the instrument. Pilot with a subset of teams. Check that items are understood as intended. Compare self-report data with observable behaviors or meeting dynamics.


When SANE Matters Most


SANE criteria become increasingly important when:


The phenomenon is latent. You cannot directly observe psychological constructs. Measurement is always inference from indicators. Getting this right requires care.

Stakes are high. If decisions affecting people's careers, team structures, or significant resources depend on measurement results, quality matters more than convenience.

Sample sizes are small. Most organizational measurement happens with tens or hundreds of respondents, not thousands. Statistical approaches designed for large samples may mislead.


Context is unique. Generic instruments developed elsewhere may not fit. The gap between "what the instrument measures" and "what you need to know" grows with contextual distance.


Change is the goal. Detecting genuine change over time requires instruments sensitive to change, not just instruments that produce numbers at two time points.


Conclusion


SMART goals have served organizations well by bringing discipline to objective-setting. But "measurable" is not enough. Measurement can be done well or poorly, and the difference matters.


SANE criteria—Specific to phenomenon, Actionable, Nuanced, Evidenced—provide a framework for evaluating measurement quality. They shift the question from "can we measure this?" to "are we measuring it in a way that will actually inform good decisions?"


The goal is not perfect measurement. Perfect measurement is impossible for complex human phenomena. The goal is measurement that is honest about what it can and cannot tell us, and useful for the decisions we need to make.


evaluoi.ai builds measurement intelligence into every step: from phenomenon-driven instrument development to uncertainty-aware analytics. Because measurement should inform decisions, not just generate numbers.
