True Assess | Secure Your Hiring Process

This article benchmarks how much disruption tools like ChatGPT can cause to pre-employment assessments. To determine this, we use ChatGPT to solve sample questions from several types of assessment. For an overview of why generative AI tools, including ChatGPT, pose a greater risk to deterministic assessment results than anything before them, please refer to our separate article on the subject.

How much risk does your assessment platform pose to your hiring process? The answer is, unfortunately, not straightforward. Each online assessment platform differs in how vulnerable it is to deception by candidates. However, the factors that quantify this risk are consistent across platforms.

So what are these factors?

Anti-Cheating Measures

The most evident factor is the presence and effectiveness of anti-cheating measures. These measures may include human proctoring, time limits, randomized questions, and browser lockdowns. Platforms with robust anti-cheating protocols generally have a lower susceptibility to manipulation by AI tools like ChatGPT. However, the mere presence of these measures does not guarantee their effectiveness in preventing cheating. The critical aspect here is how these measures integrate and complement each other to eliminate exploitable gaps. Unfortunately, many current assessment platforms have implemented various measures without thoroughly evaluating their collective effectiveness, especially in the context of generative AI.

Type of Test

The nature of the test itself significantly influences its susceptibility to cheating. Some tests are inherently more vulnerable than others. The main types compare as follows:

Multiple Choice Questions (MCQs): These tests, often based on simple recall or pattern recognition, are relatively easy for AI to manage.
Case Studies: These require a complex chain of thought and in-depth analysis, making them more challenging for AI to solve accurately.
Coding Tests: These tests, which require candidates to write and debug code, vary in risk depending on the complexity of the problem and the programming language used. ChatGPT, however, excels at solving coding problems, making these tests particularly vulnerable.
Aptitude Tests: Tests that assess logical reasoning, numerical ability, verbal skills, and pattern recognition are highly susceptible to gaming. ChatGPT demonstrates high proficiency in these areas, often scoring over 90%.

Variability of Test Content

The diversity and range of the test bank can influence vulnerability. In theory, a larger and more varied test bank should make it harder for candidates to predict or memorize answers. However, large language models like ChatGPT generate pertinent responses to whatever input they are given, so this factor has become less significant. Modifying or rewording a question may not substantially affect ChatGPT's ability to provide the correct answer.
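The three factors above can be turned into a simple qualitative checklist. The following is only a minimal sketch under stated assumptions, not a scoring method used by any platform: the `PlatformProfile` class, its field names, the test-type labels, and the wording of each concern are all hypothetical and chosen for illustration. The sketch deliberately avoids numeric weights, since the article does not define any.

```python
from dataclasses import dataclass, field

# Test types the article describes as easy for ChatGPT to handle.
# The string labels are hypothetical identifiers chosen for this sketch.
HIGH_RISK_TEST_TYPES = {"mcq", "coding", "aptitude"}


@dataclass
class PlatformProfile:
    """Hypothetical description of an assessment platform (not a real API)."""
    test_type: str
    anti_cheating_measures: set[str] = field(default_factory=set)
    measures_evaluated_together: bool = False
    relies_on_large_question_bank: bool = False


def list_concerns(profile: PlatformProfile) -> list[str]:
    """Return the qualitative concerns from the three factors that apply."""
    concerns = []
    # Factor 1: anti-cheating measures, and whether they work together.
    if not profile.anti_cheating_measures:
        concerns.append("no anti-cheating measures in place")
    elif not profile.measures_evaluated_together:
        concerns.append("measures present but not evaluated as a whole")
    # Factor 2: type of test.
    if profile.test_type in HIGH_RISK_TEST_TYPES:
        concerns.append(f"test type '{profile.test_type}' is highly vulnerable to AI")
    # Factor 3: variability of test content.
    if profile.relies_on_large_question_bank:
        concerns.append("question-bank variety offers little protection against AI")
    return concerns


# Example: an aptitude test that has some measures but relies on question variety.
example = PlatformProfile(
    test_type="aptitude",
    anti_cheating_measures={"time limits", "randomized questions"},
    relies_on_large_question_bank=True,
)
for concern in list_concerns(example):
    print(concern)
```

A checklist like this only records which concerns apply. Judging how serious each one is for a particular hiring process still requires reviewing the platform itself.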

Benchmarking how good ChatGPT is at solving different types of assessments

Rather than merely discussing how effective ChatGPT is at solving various assessments, we demonstrate its capabilities using examples from the test types mentioned above. We focus specifically on the accuracy and speed of ChatGPT's responses, as these are the primary factors determining assessment scores.

Once you have watched the video and seen how effectively ChatGPT can help dishonest candidates stand out, the situation may appear dire. Regrettably, it is even more concerning than it looks. At present, the only way to address this issue is to rely on online platforms to maintain strict measures against cheating. However, because AI was not previously a major concern, few platforms have implemented defenses against it. Furthermore, as awareness grows of how generative AI can be used to manipulate outcomes, candidates are putting more effort into finding ways to exploit these tools during evaluations. Currently, anyone with basic computer knowledge could undermine these tests if they wanted to.

Video demonstration of a compromised cognitive aptitude assessment

The above image depicts the gravity of the situation. It shows the result of a cognitive aptitude test, administered by one of the world's leading online cognitive aptitude test platforms, that one of our consultants attempted using generative AI. The attempt followed the same approach showcased in the video above, and the results rank the test-taker's cognitive aptitude score above 99% of all native English-speaking college graduates. The results achieved with generative AI not only hamper the hiring process but also actively bias it towards candidates using these methods, as the platform would classify them as the best of the best.

Moving forward

Given that ChatGPT is alarmingly accurate at assisting candidates with pre-employment assessments, and that many platforms are compromised by this reality, what are the next steps? Do we take a huge step backward and remove these platforms from our processes completely? No. These platforms still serve the purpose of narrowing down candidates and helping assess their skills, but only if they are not susceptible to the vulnerabilities discussed above.

Companies need to seek expert opinions on the platforms they use, identify their vulnerabilities, consider whether better alternatives are available, and understand how these weaknesses could affect their hiring process. Moreover, malicious candidates are likely to use AI to exploit every stage of the hiring process, so placing all responsibility on the assessment phase alone can provide a false sense of security and render the process less deterministic. It is crucial to understand the overall weaknesses in your hiring process. True Assess helps you identify the risks that your chosen assessment platform poses to your hiring process and provides tailored solutions to mitigate them, backed by a team with a proven track record in fortifying hiring practices against deception.

Addendum

The following links contain the chat histories, including both the questions put to the generative AI and the responses received:

Cognitive Aptitude Test: https://chatgpt.com/share/c362b7de-25bf-480b-9927-b0e5c0ae6ae2
Code Based Test: https://chatgpt.com/share/a617e756-2c48-4b40-9ea3-8abc6c474cd8
Domain Specific Test: https://chatgpt.com/share/cab02ce5-5959-4f20-b381-3e74a471936f

Ready to secure your hiring process?

Don't let vulnerable assessment platforms compromise your hiring decisions. Our team of experts can help you identify and address vulnerabilities in your current assessment tools and processes.