
What If Historical Data Matters?

  • Writer: Kaisa Vaittinen
  • 6 days ago
  • 4 min read

Why data on the effectiveness of organizational development should be collected in a structured form from the start


Think back for a moment. How many training programs, development initiatives, or strategic change projects has your organization implemented in the past five years?

Probably several. Perhaps dozens.


And for how many of them does structured data exist on what change was pursued, what was measured, and how results developed over time?


Probably very few, if any. And if data does exist, it is often scattered across Excel files, PowerPoint presentations, and individual consultants' reports that no one can find anymore.

This is a central structural problem in organizational development: too often, every initiative starts from zero.


Institutional Amnesia


When an organization launches a new leadership development program, it typically asks: "What are we trying to achieve this time?" And that is a good question. But it rarely asks: "What have we learned from all the previous times we tried to develop leadership?" Not because the question is uninteresting, but because the data does not exist. Or it exists in a form that does not allow comparison.


Organizational research has described this problem for decades. Walsh and Ungson (1991) analysed organizational memory through five internal retention bins and external sources, showing how organizations systematically fail to retain knowledge. Crossan, Lane, and White (1999) demonstrated that the critical breakdown occurs when individual learning is meant to become organizational practice. Pollitt (2000) captured the phenomenon with the concept of "institutional amnesia": organizations forget what they have learned despite advances in information systems. Argote (2013) documented empirically how productivity gains from experience depreciate over time without active knowledge retention.


Beer, Finnström, and Schrader (2016) documented in the Harvard Business Review that global training investments are enormous, yet their returns are often poorly verified. According to ATD's 2025 State of the Industry report, organizations spent an average of $1,254 per employee on development in 2024, yet measurement typically focuses on activity volume rather than impact.


What Is Learned Decays Without Follow-Up


Transfer research makes the problem even more concrete. Saks and Belcourt (2006) documented across 150 organizations that 62 percent of participants applied what they had learned immediately after training, but only 44 percent at six months and just 34 percent at one year. Blume et al.'s (2010) meta-analysis confirmed the same pattern: transfer effects weaken with longer measurement intervals.
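As a rough illustration, those three data points can be fitted with a simple exponential decay curve. The model choice is our assumption for illustration, not an analysis from the cited studies; a minimal Python sketch:

```python
import numpy as np

# Transfer-of-training data points reported by Saks & Belcourt (2006):
# share of participants still applying what they learned.
months = np.array([0, 6, 12])            # time since training
applying = np.array([0.62, 0.44, 0.34])  # proportion still applying

# Fit p(t) = a * exp(-k * t) via log-linear least squares
# (a modeling assumption of ours, not the study's own analysis).
slope, log_a = np.polyfit(months, np.log(applying), 1)
k = -slope  # decay rate per month

half_life = np.log(2) / k
print(f"estimated decay rate: {k:.3f} per month")
print(f"estimated half-life of transfer: {half_life:.1f} months")
```

Under this crude model, the estimated half-life of transfer is roughly 14 months, which by itself argues for follow-up measurement well beyond the end of the program.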


Saks and Burke's (2012) study showed that evaluation frequency positively predicted transfer, but behaviour- and results-level evaluation was considerably more strongly associated with transfer than measuring satisfaction or learning alone. In other words, measurement itself can support the transfer of learning into practice, but only if the right things are measured.


Without structured follow-up data, these processes remain invisible. The organization does not know whether the effect decayed, persisted, or strengthened over time.


What Would Historical Data Make Possible?


Imagine that your organization had five years of structured data from every development initiative: what phenomenon was studied, how results developed, and how the effects of different initiatives compare to one another.


With such data, you could ask questions that cannot be asked today.


For example: "Our leadership program consistently produces large changes in psychological safety but only small changes in decision-making transparency. Why? Is this a content problem in the program, or a structural barrier that training cannot resolve?"


Or simply: "How much should we budget for leadership development next year, and on what basis?" An answer grounded in data rather than intuition.


This is the power of historical data: it turns individual initiatives into cumulative learning. As data accumulates, uncertainty typically decreases and estimates sharpen, provided the measurement structure remains sufficiently consistent. The organization's "measurement memory" strengthens with use.
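To make "uncertainty typically decreases" concrete, here is a minimal simulation sketch. It assumes each initiative yields one effect estimate on a comparable scale; the true effect of 0.4 and the noise level are invented for illustration. The standard error of the cumulative estimate then shrinks roughly as 1/√n as initiatives accumulate:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assume each initiative yields one effect estimate on a comparable
# scale, scattered around an illustrative true effect of 0.4.
true_effect = 0.4
estimates = rng.normal(loc=true_effect, scale=0.3, size=50)

# As initiatives accumulate, the pooled estimate stabilizes and its
# standard error shrinks roughly as 1 / sqrt(n).
for n in (5, 10, 25, 50):
    subset = estimates[:n]
    se = subset.std(ddof=1) / np.sqrt(n)
    print(f"after {n:2d} initiatives: estimate {subset.mean():.2f}, SE {se:.2f}")
```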


Comparability Does Not Require Standardization


A common objection is that comparing initiatives requires a standardized instrument: everyone uses the same survey, the same items, the same scale. This works in research settings, but it often fails in organizations because every phenomenon, context, and target group is different.


However, psychometrics has developed methods that enable comparison without content standardization (Curran & Hussong, 2009; Bauer & Hussong, 2009). The core principle is that a measurement framework can be unique in content while remaining structurally comparable. Comparability comes from aggregability, not from sameness of content.
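One simple route to aggregability is to convert each initiative's pre/post results into a standardized effect size such as Cohen's d, which puts results measured on different instruments and scales onto one metric. This is a far cruder device than the integrative data analysis methods cited above, but it illustrates the principle; the instruments and numbers below are hypothetical:

```python
import math

def cohens_d(pre_mean, post_mean, pre_sd, post_sd):
    """Standardized mean difference: expresses a pre/post change
    in pooled standard-deviation units, independent of the scale."""
    pooled_sd = math.sqrt((pre_sd**2 + post_sd**2) / 2)
    return (post_mean - pre_mean) / pooled_sd

# Hypothetical initiatives measured with different instruments:
# (name, pre mean, post mean, pre SD, post SD)
initiatives = [
    ("leadership program, 5-point psychological safety scale", 3.1, 3.8, 0.9, 0.8),
    ("change project, 7-point decision transparency scale", 4.2, 4.5, 1.4, 1.3),
]

for name, pre_m, post_m, pre_sd, post_sd in initiatives:
    print(f"{name}: d = {cohens_d(pre_m, post_m, pre_sd, post_sd):.2f}")
```

On this common metric, the two hypothetical initiatives become directly comparable (d ≈ 0.82 versus d ≈ 0.22) even though their surveys share no items.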


What If Historical Data Already Exists?


Many organizations have been collecting data for years in various formats: Excel files, final reports, individual survey results. This data is not worthless, even if it is not structurally consistent. A structural framework can be created retroactively for existing data, enabling comparison. This is not a perfect solution, but it is considerably better than the alternative of leaving old data unused.
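What might such a retroactive framework look like? A minimal sketch of one record per past initiative, with illustrative fields to adapt to whatever your legacy files and reports actually contain:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class InitiativeRecord:
    """One retroactively catalogued development initiative.
    Fields are illustrative, not a prescribed schema."""
    name: str
    phenomenon: str                      # what change was pursued
    instrument: str                      # what was measured, and how
    start: date
    follow_ups: list[tuple[int, float]]  # (months after start, effect size)
    source: str                          # where the legacy data lives

record = InitiativeRecord(
    name="2021 leadership program",
    phenomenon="psychological safety",
    instrument="internal 5-item survey, 5-point scale",
    start=date(2021, 3, 1),
    follow_ups=[(0, 0.0), (6, 0.5), (12, 0.4)],
    source="archive/2021_leadership_final_report.xlsx",
)
```

Even this much structure makes past initiatives queryable: what was measured, when, and how the effect developed over time.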


Measurement Infrastructure Is a Strategic Investment


What happens when individual measurements accumulate into dozens, and dozens into hundreds? Comparability emerges. And comparability is what transforms historical data from an archive into a foundation for learning.


Based on this thinking, evaluoi.ai is designed to support structural and cumulative measurement. And even if you are only starting now, that is fine. We can build a measurement framework for past initiatives, bring in existing data, and begin cumulative learning from what you already have. Your history will not go to waste.



References


Argote, L. (2013). Organizational Learning: Creating, Retaining and Transferring Knowledge (2nd ed.). Springer.


ATD Research. (2025). 2025 State of the Industry: Talent Development Benchmarks and Trends. ATD Press.


Bauer, D. J., & Hussong, A. M. (2009). Psychometric approaches for developing commensurate measures across independent studies. Psychological Methods, 14(2), 101–125.


Beer, M., Finnström, M., & Schrader, D. (2016). Why leadership training fails and what to do about it. Harvard Business Review, October 2016.


Blume, B. D., Ford, J. K., Baldwin, T. T., & Huang, J. L. (2010). Transfer of training: A meta-analytic review. Journal of Management, 36(4), 1065–1105.


Crossan, M. M., Lane, H. W., & White, R. E. (1999). An organizational learning framework: From intuition to institution. Academy of Management Review, 24(3), 522–537.


Curran, P. J., & Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14(2), 81–100.


Pollitt, C. (2000). Institutional amnesia: A paradox of the 'information age'? Prometheus, 18(1), 5–16.


Saks, A. M., & Belcourt, M. (2006). An investigation of training activities and transfer of training in organizations. Human Resource Management, 45(4), 629–648.


Saks, A. M., & Burke, L. A. (2012). An investigation into the relationship between training evaluation and the transfer of training. International Journal of Training and Development, 16(2), 118–137.


Walsh, J. P., & Ungson, G. R. (1991). Organizational memory. Academy of Management Review, 16(1), 57–91.


