Are psychometric tests reliable?

The use of psychometric tests in workplaces, particularly in recruitment and selection, has grown rapidly over the past few decades. Businesses are increasingly leaning on them to glean insights into their potential employees and to avoid costly mis-hires. But how do psychometric tests work, what do the results tell you, and most importantly, can you trust the results to aid your hiring decisions? 

What is validity and reliability in psychometric tests?

“One of the key markers of a good psychometric test is high reliability and validity”, says Preeya Patel, a Senior Business Psychologist at Clevry.

“But not all tests have both or either of these, so it’s up to test users to ask test publishers this information to ensure that test quality is to a high standard and that the test can be used in good faith with candidates”, Patel notes.

Reliability refers to the consistency and stability of test scores over time. It ensures the test provides consistent results when administered to the same individual multiple times under similar conditions. A reliable test should yield similar scores for the same individual, regardless of when or where it is administered.

“Ensuring that the test or personality questionnaire accurately measures the psychological attribute being assessed is very important. If a psychometric test lacks reliability, then none of its scores can be considered accurate; therefore, we cannot use its results as a basis for decisions in the workplace”, says Patel.

There are different types of reliability that researchers and test developers consider when evaluating a psychometric test. Test-retest reliability, for example, involves administering the same test to the same group of individuals on two separate occasions and examining the correlation between the scores. If the scores are highly correlated, the test has good test-retest reliability.

Another type of reliability is internal consistency, which assesses how well the items within a test are measuring the same construct. This is typically measured using statistical techniques such as Cronbach’s alpha. If the items within a test are highly correlated, the test has good internal consistency reliability.
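For readers who want to see what sits behind the statistic, here is a minimal sketch of the standard Cronbach's alpha formula: the number of items, divided by that number minus one, multiplied by one minus the ratio of summed item variances to the variance of total scores. The function name and the response data are hypothetical, and the sketch assumes every respondent answered every item.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of item responses.

    `responses` is a list of rows, one row per respondent,
    each row holding that person's score on every item.
    """
    item_count = len(responses[0])
    if item_count < 2 or len(responses) < 2:
        raise ValueError("Need at least two items and two respondents")

    # Transpose rows (people) into columns (items).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)

    # Variance of each person's total score across all items.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("Total scores must vary")

    return (item_count / (item_count - 1)) * (
        1 - item_variance_sum / total_variance
    )


# Hypothetical answers from five people to three items on a 1-5 scale.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha: {alpha:.2f}")
```

In this made-up data the three items rise and fall together, so alpha comes out high. Items that moved independently of each other would pull it towards zero.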

Patel points out a crucial element to consider when assessing the reliability of a test. “Whatever measure of reliability has been used, you should look for the correlation coefficient to be at least 0.75 for ability tests and 0.70 for personality questionnaires to be sure that it is accurate and reliable.”

“This means that, compared to most other forms of workplace assessment, good quality psychometric tests are very accurate and relatively free from error. But even tests with very high levels of reliability are not perfect measures; there is always potential for some error to creep into test scores”, Patel continues.

Validity refers to the extent to which a test measures what it claims to measure. It ensures that the test accurately assesses the psychological attribute it intends to measure. A test is considered valid if it accurately captures the construct it claims to measure and provides meaningful and valuable information about the individual being tested.

As with reliability, researchers and test developers evaluate various forms of validity when examining a psychometric test. Content validity, for example, refers to how well the test items represent the measured construct. A test to measure general intelligence should include items that tap into various cognitive abilities, such as verbal reasoning, spatial reasoning, and problem-solving. If the test only focuses on one aspect, it may lack content validity.

Another type of validity is criterion validity, which assesses how well the test scores correlate with an external criterion. For example, if a test is designed to measure job performance, criterion validity would involve comparing the test scores with actual job performance ratings. If the test scores positively correlate with job performance, it suggests the test has criterion validity.

Both validity and reliability play a vital role in determining the overall quality of a psychometric test. A test may be reliable but lack validity, meaning it consistently measures something, but not the intended construct. The reverse does not hold: a test that lacks reliability cannot be fully valid, because even if its items target the intended construct, scores that vary widely from one administration to another cannot be trusted to reflect it.

How are psychometric tests scored?

Different methods are employed when scoring psychometric tests depending on the test type. For instance, in multiple-choice tests, such as ability tests, each correct answer is awarded a specific score. The final score is then calculated by summing up the individual scores for each question. This straightforward scoring method allows for a quick and objective evaluation of an individual’s knowledge or cognitive abilities.

However, not all psychometric tests are as straightforward as multiple-choice exams. Personality assessments, for example, employ more complex scoring mechanisms. These tests measure an individual’s personality traits, such as extraversion, agreeableness, emotional stability, and openness to change.

Scoring personality assessments involves evaluating specific traits and providing scores based on predefined categories. One commonly used response format is the Likert scale, in which candidates rate statements on a scale of agreement or disagreement.
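To make Likert-based scoring more concrete, the sketch below adds up one person's ratings for a single trait. It assumes a 1-5 agreement scale and that some statements are worded in the opposite direction and therefore reverse-scored; both are common conventions rather than a description of how any particular publisher, Clevry included, scores its questionnaires. All names and ratings are hypothetical.

```python
def score_trait(ratings, reverse_keyed=(), scale_max=5):
    """Sum Likert ratings for one trait into a raw score.

    `ratings` holds one rating per statement (1 = strongly disagree,
    `scale_max` = strongly agree). `reverse_keyed` lists the positions
    of statements worded against the trait, which are flipped first.
    """
    total = 0
    for position, rating in enumerate(ratings):
        if not 1 <= rating <= scale_max:
            raise ValueError(f"Rating {rating} is outside 1-{scale_max}")
        if position in reverse_keyed:
            # Flip the rating: 1 becomes scale_max, 2 becomes scale_max - 1, ...
            rating = scale_max + 1 - rating
        total += rating
    return total


# Hypothetical ratings for four extraversion statements.
# Statements at positions 1 and 3 are worded negatively,
# e.g. "I prefer to stay in the background".
extraversion_ratings = [4, 2, 5, 1]
extraversion_raw = score_trait(extraversion_ratings, reverse_keyed={1, 3})
print(f"Extraversion raw score: {extraversion_raw}")
```

A raw total like this only says where the person sits on the scale of the questionnaire itself. It carries no judgement about a "good" or "bad" profile.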

“Clevry and many other psychometrics’ scoring describes a candidate’s performance by comparing it with a group of people who have taken the test before. When a candidate completes a test, a raw score is produced – usually the number of questions the candidate has completed correctly, in the case of ability tests”, Patel explains.

“To interpret a particular individual’s raw score, we need to compare it with the scores of a similar group of individuals who have taken the test before – to understand whether our candidate’s performance compares with others’ abilities and tendencies”, she adds.

However, a raw score on its own is hard to interpret, so it is not what test users see when they look at a candidate’s results in a report.

“When we compare an individual’s test score to a group of others (a norm group), the raw score is converted to a different type of score, which indicates the level of the candidate’s performance on the test. Many psychometrics, including Clevry, use ‘Sten’ scoring to support ease of interpretation, providing psychometric results within a 1-10 scoring range. Scores in the middle (4-6) are interpreted within the average range”, Patel explains.
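The conversion from a raw score to a Sten can be sketched as follows. This uses the general textbook definition of Sten scores (a scale with a mean of 5.5 and a standard deviation of 2, limited to 1-10) and a tiny made-up norm group; it is an illustration only, not Clevry's actual scoring procedure, and real norm groups are far larger.

```python
import math
from statistics import mean, pstdev


def raw_to_sten(raw_score, norm_scores):
    """Convert a raw score to a Sten (1-10) against a norm group.

    Assumes the textbook Sten definition: mean 5.5, standard deviation 2.
    """
    norm_mean = mean(norm_scores)
    norm_sd = pstdev(norm_scores)
    if norm_sd == 0:
        raise ValueError("Norm group scores must vary")

    # How many standard deviations the candidate is from the norm mean.
    z_score = (raw_score - norm_mean) / norm_sd

    # Rescale, round half up, then keep the result inside 1-10.
    sten = math.floor(5.5 + 2 * z_score + 0.5)
    return max(1, min(10, sten))


# Hypothetical raw scores from people who took the test before.
norm_group = [10, 12, 14, 16, 18]

candidate_sten = raw_to_sten(18, norm_group)
print(f"Candidate Sten: {candidate_sten}")
```

Because the Sten depends on the comparison group, the same raw score can translate into a different Sten when a different norm group is used.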

Are there right and wrong answers in psychometric tests?

The answer is yes and no. For cognitive ability tests, such as numerical tests, there are correct answers that people are scored against. With personality assessments, this is not the case.

“There are no right or wrong answers, no right or wrong personality profile. Individuals are asked to rate on a scale to indicate where their preferences or tendencies lie, i.e., which option best reflects how I think about myself”, Patel says.

How should test results be used to aid hiring decisions?

Psychometric tests are often used in the context of hiring decisions. While they provide valuable insights into an individual’s characteristics and abilities, they should not be the sole basis of hiring decisions.

“When it comes to hiring decisions, best practice is to use a plethora of candidate information to ensure you make as good decisions as possible. Consider what the test results indicate about a candidate’s performance and potential”, says Patel.

Patel notes that psychometrics should be seen as an indicator, not an absolute or crystal ball, of how a person may perform in the role.

One crucial thing to keep in mind whenever hiring new employees is unconscious bias. “People involved in assessing candidates need to be aware of their biases (everybody’s got them!) and make efforts to check their biases and look at candidate performance in assessments as objectively as possible”, Patel says.

What are the limitations of psychometric tests?

While psychometric tests are a powerful tool to indicate job performance and suitability for the role, they do have limitations.

“Psychometrics are not a ‘silver bullet’ pointing you towards the best candidate with 100% accuracy or predictability of performance in the role. Test users need to understand what psychometrics can and cannot do in a talent management process, e.g., when hiring top performers”, Patel points out.

The first step towards a reliable assessment process with results you can trust is to use assessments that are well-researched, validated and scientifically sound. The second step is to ensure that test users are trained and understand both the nature of the tests and how to interpret the results.

“All psychometric tests are susceptible to a degree of error in scoring. Well-designed tests and people using tests appropriately support much less error entering into assessment processes”, Patel concludes.

