"Test-retest reliability" is a buzz phrase in the assessment world. But
beyond a lofty scientific term, it is an extremely important concept
for you and your business. Understanding reliability is the key to determining
which tool will bring you the most ROI.
So, what exactly is test-retest reliability?
Test-retest reliability is the consistency of an assessment's
findings over time. In other words, how consistent are the results 1 week from
now? 5 years from now? 25 years from now? This factor is measured
by administering an assessment to a group of people, allowing time to
pass (e.g., 1 month, 1 year or 5 years) and having the same group of people retake the
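assessment. In simplified terms, if each person receives the same result both times, the test-retest
reliability rating is 1.0. If only half of the group receives the same result,
the test-retest reliability rating is 0.5...and so on. (Published studies usually report
the rating as a correlation between the two sets of scores, but it is read the same way:
the closer to 1.0, the more consistent the results.)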
Here is a quick example:
If I were given an assessment each year with one question, "What is your
first name?" I would have a 1.0 test-retest reliability rating because I would
always answer "Emily"...no matter if it is 1 year from now or 30 years from now.
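For readers who like to see the arithmetic, the simplified rating described above can be sketched in a few lines of Python. This is only an illustration of the "share of people who get the same result twice" idea; the function name and the sample results are made up for the example, and real reliability studies typically use more involved statistics.

```python
def agreement_rate(first_results, second_results):
    """Share of people who received the same result on both administrations."""
    if len(first_results) != len(second_results):
        raise ValueError("Both administrations must cover the same people")
    if not first_results:
        raise ValueError("At least one person is required")
    # Count the people whose result did not change between the two sittings.
    matches = sum(a == b for a, b in zip(first_results, second_results))
    return matches / len(first_results)


# Hypothetical results for four people, listed in the same order both times.
first = ["A", "B", "C", "D"]
second = ["A", "B", "D", "C"]
rate = agreement_rate(first, second)  # two of the four match, so 0.5

# The one-question "What is your first name?" assessment, taken three times over.
name_rate = agreement_rate(["Emily"] * 3, ["Emily"] * 3)  # everyone matches, so 1.0
```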
Why is test-retest reliability important?
Test-retest reliability determines the time value of a
tool. You could consider it an assessment's expiration date.
An assessment's time value has direct implications for your business
decisions. Simply put, the time frame of your decision MUST match the expiration
date of the tool. For example, if an assessment is deemed "good for a year"
(i.e., test-retest reliability drops off significantly after one year), then you do
not want to use that assessment to hire someone you hope to have on your team for 10
years.
Need another example? Think back to high school. A teacher from your senior
year would never accept a test you took your freshman year. Why? Because the
teacher wants to see (and should see) four years of growth. Your knowledge base changed
over time, so the test that you aced (or failed) your freshman year no longer
applied to you as a senior.
The implications of time value are also financial. Would you rather pay to
assess your team once and have that information apply over the long term? Or
would you rather pay to reassess your team each year and
be limited to short-term decisions? It's like buying a printer for $100 and
paying $75 each year for replacement ink versus buying a printer for $100 with a
lifetime supply of ink. No-brainer!
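To put rough numbers on the printer comparison, here is a small Python sketch. The $100 and $75 figures come from the analogy above; the five-year horizon and the assumption that ink is bought in every one of those years are mine, chosen only for illustration.

```python
def total_cost(upfront, yearly, years):
    """Total spend after a given number of years of ownership."""
    return upfront + yearly * years


years = 5  # assumed planning horizon, not a figure from the article

# Printer that needs $75 of replacement ink every year.
with_yearly_ink = total_cost(100, 75, years)  # 100 + 75 * 5 = 475

# Printer that comes with a lifetime supply of ink.
with_lifetime_ink = total_cost(100, 0, years)  # 100
```

The gap widens every additional year, which is the same pattern you would expect when comparing a one-time assessment with one that has to be repeated annually.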
What to look for:
When determining what assessment(s) to use in your business -- and this
applies to ALL aspects of your business whether it be hiring, team building,
team creation, management training, etc. -- only use tools with the highest
test-retest reliability ratings available. An accepted range is 0.8 to
0.9; anything over is fantastic, anything below is cautionary. And be
careful: you must check the reliability over time.
Most assessments score a high test-retest reliability rating within a short time
frame...but pay attention to those numbers after 6 months, 1 year and 5 years especially.
You will find that all but a few assessments experience an extreme drop off in
reliability the further out you move on the timeline.
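One way to read a reliability-over-time table is to find the last interval at which the rating is still inside the accepted range. The Python sketch below does that using the 0.8 lower bound mentioned above. The function name is hypothetical and the sample ratings are invented for illustration; they do not describe any real assessment.

```python
ACCEPTED_MINIMUM = 0.8  # lower end of the accepted range cited above


def usable_horizon(reliability_by_months, minimum=ACCEPTED_MINIMUM):
    """Longest interval (in months) reached before reliability first drops below minimum.

    Returns None if even the shortest interval falls short.
    """
    horizon = None
    for months in sorted(reliability_by_months):
        if reliability_by_months[months] < minimum:
            break  # everything past the first drop-off is treated as expired
        horizon = months
    return horizon


# Made-up ratings at 1 month, 6 months, 1 year and 5 years (illustration only).
example_study = {1: 0.92, 6: 0.88, 12: 0.81, 60: 0.55}
horizon = usable_horizon(example_study)  # 12: "good for a year" in this invented case
```

In this invented case the tool would suit decisions with a time frame of about a year, but not a 5-year decision.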
How do you obtain this information?
Just ask for it. Every consultant/assessment company should be more than
willing to furnish a full statistical analysis of their tool(s), which needs
to include a test-retest reliability study as well as case studies,
predictability measurements and validity studies. Ideally, this information is
readily available online. If a simple request or basic Google search does not lead
you to this information...let that be a red flag. I've never found an
organization that hides statistical data supporting its tool.
On a final note, here are links to the test-retest reliability studies of
some of the most widely used assessments.* The results may (or may not) surprise
you:
*These links lead to conclusions found by the assessment maker and/or from
a trusted, third-party researcher. These resources are a start; there is much
more information/commentary available on this topic for each assessment listed
above and most others.