Should we use AI to screen and shortlist job candidates?
Question
The pitch is simple: thousands of applications, a small team, and software that promises to read every CV in seconds, score them, and hand you a shortlist — faster and, the vendors say, more consistent than a tired human. For a leader the real decision is three-way: let an AI tool filter and reject candidates automatically, use AI only to assist a person who still makes the call, or keep screening human. The gap between those three is where the risk lives.
One honest caveat up front. Most of the credible, independent evidence on AI hiring is not about whether it saves time — it plainly can — but about what it gets wrong and who it harms. So this brief weighs the decision on those terms: bias, oversight, candidate experience, and the law. The efficiency is real; it is also the easy part.
Evidence
The bias is documented and concrete, not hypothetical. As an open-licensed analysis in The Conversation (Rafi, 2024) sets out, a widely used video-interview tool "favoured certain facial expressions, speaking styles and tones of voice, disproportionately disadvantaging minority candidates" — drawing a 2019 US federal complaint — and Amazon's experimental recruiting engine, trained on a decade of male-dominated résumés, "favoured male candidates by downgrading applications that included the word 'women's' and penalizing graduates of women's colleges." Facial-analysis tools show "higher error rates for racialized individuals, particularly racialized women, because they are underrepresented in the data used to train these systems." The pattern is consistent: an AI trained on the past reproduces the past.
The mechanism is that AI learns proxies for the very traits you're not allowed to select on. A second open-licensed Conversation piece (Kelan, 2023) explains that a system can quietly learn to prefer a candidate "named 'Mark' over 'Mary'" — picking up spurious correlations that "reflect rather than correct societal inequities." The discrimination doesn't need a protected field in the data; the model infers it from names, postcodes, gaps, hobbies, and phrasing.
The standard safeguard ("don't worry, a human reviews the AI's picks") often fails, and there is experimental evidence for it. In a controlled study in Frontiers in Psychology (Rosenthal-von der Pütten & Sach, 2024), an automated decision-support system applied a systematic 10-point penalty to Turkish-named applicants, and only 41% of participants explicitly noticed the discrimination. Most people deferred to the biased scores without registering that anything was wrong. This is the single most important finding for the decision: "keep a human in the loop" is not a safeguard if the human can't see the bias, and in this study most could not.
Candidates notice they're being judged by a machine, and it costs you. In another Frontiers in Psychology study (Schick & Fischer, 2021), most respondents (54.1%) felt poorly or very poorly prepared for AI in selection even as 59.4% expected it to become common, and they rated AI evaluating personality as the lowest-quality assessment objective of those tested. A 2025 Frontiers in Artificial Intelligence study (Malin et al., N = 921) found that AI rejection without an explanation scored lowest on every fairness dimension, but that a clear explanation closed the gap, matching or beating unexplained human decisions. Translation: an opaque automated rejection is the worst of both worlds for your employer brand; transparency is not optional.
The honest counter-weight: used well, AI can reduce some human bias. Kelan (2023) is explicit that with representative data and genuine oversight, AI-supported hiring "could make hiring more inclusive" — but only alongside "impact assessments and AI audits that check systematically for discriminatory effects," which she calls "crucial." That conditional is the whole game: the upside is real and contingent on work most buyers skip.
Disagreement
| View | The claim | Where it holds — and breaks |
|---|---|---|
| "Adopt AI screening: it's faster, cheaper and more consistent than humans" | At scale, software reads every CV the same way and frees the team for higher-value work. | Holds on throughput and consistency, including consistency of a bias applied to every applicant at once. Breaks as a safety claim: the documented failure mode is discrimination that human reviewers demonstrably fail to catch (only 41% noticed it in the study above), and an opaque auto-reject damages candidate trust. Speed is not the same as soundness. |
| "AI removes human bias — the algorithm is objective" | A model has no gut feeling, no favouritism, no bad day. | Holds only with representative data, systematic audits, and oversight. Breaks by default: models inherit the bias in their training data and learn proxies for protected traits, so an un-audited tool automates discrimination rather than removing it. "Objective" describes the math, not the outcome. |
The real split isn't AI vs. no AI. It's AI as an assistant to a structured human decision vs. AI as the decision-maker: software that parses, organises and surfaces — with a person applying job-relevant criteria and able to contest it — versus software that scores and silently rejects. The first is defensible; the second is a discrimination-at-scale and legal-exposure risk wearing an efficiency badge.
Peoplense Verdict
Use AI to assist a structured human decision — never to auto-reject — and only with auditing, explainability and disclosure in place. The efficiency is genuine; the documented harms are too, and the usual reassurances ("a human checks it," "the algorithm is objective") don't survive contact with the evidence.
- What to rely on: AI for the mechanical work — parsing CVs, extracting skills, de-duplicating, organising a pile so a person can review job-relevant evidence faster. Keep a human making the accept/reject call against explicit, job-relevant criteria.
- What to avoid: letting a tool automatically reject or rank-and-cut candidates; AI that scores personality or "culture fit" from video or text; and any "black box" you can't explain to a rejected applicant or a regulator. "There's a human in the loop" is not a control unless that human is actually equipped to detect bias.
- The point that matters: the danger isn't the robot's malice, it's its obedience — it will apply whatever bias is in the data, consistently, to everyone, and the people reviewing it mostly won't notice. Your job is to make the bias visible (audits) and the decision contestable (a human, an explanation), or don't deploy it.
What to do today
- Draw one bright line: AI never auto-rejects. It can sort, summarise and surface; a person makes every accept/reject and shortlist call. This single rule removes most of the legal and ethical risk.
- Audit for adverse impact — because your reviewers won't spot it. Before and after deployment, test selection rates across gender, nationality and age. If the tool advantages one group, that's a defect to fix, not a quirk to tolerate. (Only 41% of people noticed a deliberate, sizeable bias — assume yours is invisible until measured.)
- Demand explainability from any vendor. If they can't tell you which features drove a score and let you reproduce a decision, you can't defend it to a candidate or a regulator — walk away.
- Tell candidates, and give them a human to appeal to. Disclose that AI is used, and route any rejection to a contestation channel. A clear explanation measurably improves perceived fairness; silence does the opposite.
- Treat data-protection duties as the floor, not the ceiling. Establish a lawful basis for processing candidate data, run a data-protection impact assessment for high-stakes automated screening, and keep meaningful human oversight on consequential decisions.
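The adverse-impact audit in the list above can be sketched in a few lines. This is a minimal illustration, not a compliance tool: the group labels and counts are invented, the function names are hypothetical, and the 0.8 cut-off is the "four-fifths" rule of thumb from the US EEOC Uniform Guidelines, which the sources in this brief do not cite and which may not be the right test in your jurisdiction.

```python
from collections import Counter


def selection_rates(records):
    """Return {group: share of applicants shortlisted}.

    `records` is an iterable of (group, was_shortlisted) pairs.
    """
    applied = Counter()
    shortlisted = Counter()
    for group, was_shortlisted in records:
        applied[group] += 1
        if was_shortlisted:
            shortlisted[group] += 1
    return {group: shortlisted[group] / applied[group] for group in applied}


def adverse_impact(records, threshold=0.8):
    """Return {group: impact ratio} for groups below the threshold.

    The impact ratio is a group's selection rate divided by the highest
    group's selection rate. The 0.8 default is an assumed rule of thumb
    (four-fifths), not a legal standard taken from this brief.
    """
    rates = selection_rates(records)
    best = max(rates.values())
    if best == 0:
        return {}  # nobody was shortlisted, so no ratio can be computed
    return {
        group: rate / best
        for group, rate in rates.items()
        if rate / best < threshold
    }


# Invented example data: 100 applicants per group.
records = (
    [("group_a", True)] * 30 + [("group_a", False)] * 70
    + [("group_b", True)] * 18 + [("group_b", False)] * 82
)

for group, ratio in adverse_impact(records).items():
    print(f"{group}: impact ratio {ratio:.2f} is below threshold, investigate")
```

In this invented data, group_b is shortlisted at 18% against 30% for group_a, a ratio of 0.6, so it is flagged. With small applicant pools a ratio like this can arise by chance, so a real audit would pair it with a significance test and a review of which features drive the gap.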
GCC Relevance
There is a real Gulf angle — partly about volume, and squarely about law.
The volume pressure that makes AI screening tempting is acute in the Gulf. Saudisation targets under the Nitaqat programme and large, young applicant pools mean recruiters in KSA and across the GCC face exactly the high-throughput problem these tools are sold against — which makes the temptation to let AI decide (not just assist) stronger here, and the bias risk correspondingly more consequential where nationalisation and diversity goals ride on who gets hired.
Saudi Arabia's Personal Data Protection Law (PDPL) is the binding constraint, and it reaches your vendor. The PDPL, administered by SDAIA, governs the personal data of individuals in the Kingdom regardless of where the processor sits, so using a foreign AI-screening vendor does not put the data outside its scope. Consequential, solely automated decisions trigger expectations around disclosure, meaningful human oversight, lawful basis, cross-border-transfer rules, and a data-protection impact assessment for high-risk processing. Confirm the exact obligations against the current PDPL and SDAIA guidance before deploying, but the direction is clear: an opaque, auto-rejecting tool is a compliance problem, not just an ethics one.
Honest scope: the bias, oversight and candidate-experience evidence here is non-Gulf (European and North American samples) — we found no Gulf-specific peer-reviewed open-licensed study on AI hiring. The KSA hook is the PDPL/SDAIA legal framework and the Nitaqat volume context; treat the regulatory specifics as pointers to verify against the primary law, not as legal advice.
Sources
Library / open-licensed sources (Creative Commons; quoted from the pages themselves):
- Rafi, M. (2024), When AI plays favourites: how algorithmic bias shapes the hiring process, The Conversation. Licence: CC BY-ND 4.0. Documented cases: video-interview tools disadvantaging minority candidates (2019 federal complaint); Amazon's tool downgrading "women's" CVs; higher facial-analysis error rates for racialized women.
- Kelan, E. (2023), AI can reinforce discrimination — but used correctly it could make hiring more inclusive, The Conversation. Licence: CC BY-ND 4.0. AI learns proxies (preferring "Mark" over "Mary"); impact assessments and systematic AI audits are "crucial"; with diverse data + oversight AI can aid inclusion.
- Rosenthal-von der Pütten, A. M. & Sach, A. (2024), Michael is better than Mehmet: exploring the perils of algorithmic biases and selective adherence to advice from automated decision support systems in hiring, Frontiers in Psychology. Licence: CC BY. Only 41% of participants noticed a systematic 10-point algorithmic penalty against Turkish-named applicants; this is the evidence that human oversight often fails to catch algorithmic bias.
- Schick, J. & Fischer, S. (2021), Dear Computer on My Desk, Which Candidate Fits Best? Candidates' Perception of Assessment Quality When Using AI in Personnel Selection, Frontiers in Psychology. Licence: CC BY. 54.1% felt poorly prepared for AI selection; AI evaluating personality rated the lowest-quality assessment objective.
- Malin, C., Fleiß, J., Ortlieb, R. & Thalmann, S. (2025), Rejected by an AI? Comparing job applicants' fairness perceptions of artificial intelligence and humans in personnel selection, Frontiers in Artificial Intelligence. Licence: CC BY. N = 921; AI rejection without explanation scored lowest on every fairness dimension; a clear explanation closed the gap with human decisions.
Cited findings (named and linked, not republished — these do not carry an open licence):
- Amazon's scrapped AI recruiting tool (reported by Reuters, 2018) — the canonical cautionary case: a model trained on male-dominated résumés learned to penalise the word "women's." Cite-only; also referenced inside the Conversation sources above.
- EU Artificial Intelligence Act (Regulation (EU) 2024/1689), Annex III — classifies AI used for recruitment and candidate evaluation as high-risk, triggering transparency, risk-management and human-oversight obligations. EUR-Lex (official EU text — not CC; cite-only).
- Saudi Personal Data Protection Law (PDPL) — administered by SDAIA; governs candidate data of individuals in the Kingdom regardless of processor location, with duties around automated decisions, lawful basis, cross-border transfer and impact assessments. Verify specifics against the official SDAIA / PDPL text before publishing. (Primary legal source — not CC; cite-only.)