
How to Audit Your Resume Screening Software for Hiring Bias

Alex Thompson
October 15, 2025
15 min read


Here's a wake-up call: 83% of companies now use AI to screen resumes, but most have never audited those tools for bias. NYC's Local Law 144 now requires annual bias audits, with fines up to $1,500 per violation. Workday is facing a collective action lawsuit alleging its screening tool discriminated based on race, age, and disability. University of Washington research found that AI favored white-associated names 85% of the time. If you're using screening software without auditing it, you're not just taking on compliance risk; you're perpetuating systematic discrimination. Here's how to fix that.


Why do I need to audit my screening software—isn't it already fair?

That's exactly what everyone thinks before they audit. Then reality hits.

The research is brutal: University of Washington tested three state-of-the-art LLMs (from Mistral AI, Salesforce, and Contextual AI) across 554 resumes and 571 job descriptions—over 3 million combinations. Results? White-associated names were preferred 85% of the time. Female-associated names only 11%. The AI never once favored Black male-associated names over white male-associated names. Not once.

This isn't buggy legacy software. These are 2024's leading AI systems—and they're systematically biased.

The legal reality: NYC Local Law 144 (effective July 2023) requires annual third-party bias audits for any automated employment decision tool used in NYC. Penalties start at $500 per violation, maxing at $1,500. "Per violation" means per candidate affected—that adds up fast.

The litigation risk: The Mobley v. Workday lawsuit alleges Workday's automated resume screening discriminated based on race, age 40+, and disability. As of May 2025, the court allowed it to proceed as a collective action. That's not an isolated case—it's a preview of what's coming for any company using unaudited screening tools.

The business case: Unilever audited their AI systems and improved processes. Result? 50% increase in women in management roles. Auditing isn't just compliance—it's finding broken processes that cost you top talent.

Bottom line: Your screening software isn't fair by default. It's only fair if you've tested it and proven it.

What exactly should I be testing for in a bias audit?

Bias manifests in specific, measurable ways. Here's what to test:

1. Demographic Selection Rates

What percentage of candidates from each demographic group advance to next stages? If 60% of white candidates pass screening but only 35% of Black candidates with similar qualifications, you've got disparate impact.

NYC Law 144 requires testing:

  • Sex categories (male/female/intersex at minimum)
  • Race/ethnicity categories (at minimum: Hispanic or Latino, White, Black or African American, Native Hawaiian or Other Pacific Islander, Asian, American Indian or Alaska Native, Two or More Races)
  • Intersectional categories (e.g., Black women, Asian men, etc.)

2. Impact Ratios

Compare selection rates across groups. The 4/5ths rule: if any group's selection rate is less than 80% of the highest group's rate, you likely have adverse impact.

Example: If white candidates pass at 70% and Black candidates at 52%, your impact ratio is 52/70 = 0.74 (74%). That's below the 80% threshold, so you likely have adverse impact.
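If you want to automate that check, here's a minimal Python sketch. The group names and counts are illustrative (they reproduce the 70%/52% example above), not real data:

```python
# Minimal 4/5ths-rule check. Group names and counts are illustrative;
# plug in your own screening totals.
selection = {
    "white": {"passed": 140, "total": 200},   # 70% selection rate
    "black": {"passed": 104, "total": 200},   # 52% selection rate
}

# Selection rate = candidates passing screening / candidates screened.
rates = {group: c["passed"] / c["total"] for group, c in selection.items()}
highest = max(rates.values())

for group, rate in rates.items():
    impact_ratio = rate / highest
    flag = "ADVERSE IMPACT" if impact_ratio < 0.80 else "ok"
    print(f"{group}: selection rate {rate:.0%}, impact ratio {impact_ratio:.2f} -> {flag}")
```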

3. Name-Based Bias

Test identical resumes with only names changed—"Emily" vs "Lakisha," "Greg" vs "Jamal," "John" vs "Juan." University of Washington research specifically tested this and found massive disparities. Your audit should too.

4. Intersectional Bias

Don't just test gender and race separately. Test Black women vs white women vs Black men vs white men. Research shows bias patterns aren't additive—intersectional identities face unique discrimination.

5. Age Indicators

Does your AI penalize graduation dates from the 1980s or '90s? Career lengths of 20+ years? These are proxies for age, and discrimination against candidates 40 and over is illegal.

6. Disability Accommodations

Employment gaps for medical reasons? Non-traditional career paths? Workday's lawsuit alleges disability discrimination—test if your tool penalizes these patterns.

7. Education Bias

Does AI favor Ivy League schools? Penalize community colleges or bootcamps? That's socioeconomic bias that correlates with race/ethnicity.

8. Experience Gaps

Career breaks for caregiving disproportionately affect women. Does your AI treat five years of experience with a one-year gap differently than five continuous years?

What's the step-by-step process for conducting a bias audit?

Here's the protocol compliant organizations follow:

Phase 1: Preparation (Week 1-2)

Step 1: Inventory your tools. List every piece of software that screens, scores, ranks, or recommends candidates. Yes, even your ATS if it auto-rejects based on keywords. If it "substantially assists" hiring decisions, it counts under NYC law.

Step 2: Collect historical data. Pull 6-12 months of screening data including:

  • All candidates screened (ideally 1000+ for statistical validity)
  • Demographics (you'll need to gather this—more on that later)
  • Screening outcomes (pass/fail, scores, rankings)
  • Final hiring decisions

Step 3: Engage an auditor. NYC requires independent third-party auditors. Even if you're not in NYC, external auditors catch issues internal teams miss. Look for auditors who are experienced in algorithmic bias, familiar with EEOC standards, and able to provide rigorous statistical analysis.

Phase 2: Testing (Week 3-6)

Step 4: Comparative resume testing. Create 20-30 pairs of identical resumes where only demographic indicators change:

  • Same qualifications, different names (white/Black/Latino/Asian-associated)
  • Same skills, different gender pronouns
  • Same experience, different university tiers (Ivy vs state school vs community college)
  • Same background, different age indicators (graduation dates)
  • Same achievements, different career patterns (gaps vs continuous)

Run these through your screening software. Document every difference in scores, rankings, or pass/fail outcomes.
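Here's a rough sketch of what that comparative test can look like in code. The resume template and name lists are illustrative, and score_resume is a placeholder for however your vendor exposes scoring (API call, bulk upload, CSV export):

```python
import itertools
import statistics

# Sketch of name-swap resume testing. Everything except the name is held
# constant, so any score gap is attributable to the name alone.
RESUME_TEMPLATE = """{name}
Senior Accountant, 7 years experience
CPA, Excel, NetSuite, month-end close, GAAP reporting
B.S. Accounting, State University, 2016
"""

NAME_VARIANTS = {  # illustrative name sets, extend per group you test
    "white_male": ["Greg Walsh", "Todd Baker"],
    "black_male": ["Jamal Robinson", "Darnell Jackson"],
    "white_female": ["Emily Sullivan", "Anne Murphy"],
    "black_female": ["Lakisha Washington", "Tamika Jones"],
}

def score_resume(text: str) -> float:
    """Placeholder: call your screening tool here and return its numeric score."""
    raise NotImplementedError

def run_name_swap_test() -> dict[str, float]:
    results = {}
    for group, names in NAME_VARIANTS.items():
        scores = [score_resume(RESUME_TEMPLATE.format(name=n)) for n in names]
        results[group] = statistics.mean(scores)
    # Document every pairwise gap you observe.
    for a, b in itertools.combinations(results, 2):
        print(f"{a} vs {b}: mean score gap = {results[a] - results[b]:+.2f}")
    return results
```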

Step 5: Calculate selection rates. Using historical data, calculate what percentage of each demographic group passed screening:

  • Male candidates: 420 passed / 600 total = 70%
  • Female candidates: 280 passed / 500 total = 56%
  • Impact ratio: 56% / 70% = 0.80 (exactly at threshold)

Repeat for all protected categories and intersectional combinations.
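A minimal pandas sketch of that calculation, assuming your historical export has one row per screened candidate with gender, race, and a passed_screening flag (the file name and column names are assumptions; adapt them to your ATS export):

```python
import pandas as pd

# Assumed export: one row per candidate with columns
# candidate_id, gender, race, passed_screening (True/False).
df = pd.read_csv("screening_history.csv")

def impact_ratios(frame: pd.DataFrame, group_cols: list[str]) -> pd.DataFrame:
    grouped = frame.groupby(group_cols)["passed_screening"].agg(["mean", "count"])
    grouped = grouped.rename(columns={"mean": "selection_rate", "count": "n"})
    grouped["impact_ratio"] = grouped["selection_rate"] / grouped["selection_rate"].max()
    grouped["adverse_impact"] = grouped["impact_ratio"] < 0.80
    return grouped.sort_values("impact_ratio")

print(impact_ratios(df, ["gender"]))          # single category
print(impact_ratios(df, ["race"]))            # single category
print(impact_ratios(df, ["race", "gender"]))  # intersectional combinations
```

Watch the n column: very small groups produce noisy ratios, which is one more reason to aim for 1,000+ candidates of historical data.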

Step 6: Red team testing. A dedicated team intentionally tries to surface bias using edge cases:

  • Candidates with disabilities disclosed
  • Unconventional career paths
  • Non-native English speakers (language patterns)
  • Career changers with transferable skills
  • Candidates from HBCUs or minority-serving institutions

Step 7: Correlation analysis. Test if seemingly neutral factors correlate with protected characteristics:

  • Does zip code correlate with race?
  • Do employment gaps correlate with gender?
  • Does university tier correlate with socioeconomic status/race?

If yes, these are proxy variables perpetuating indirect discrimination.
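One way to quantify "correlates with" is a simple association test. The sketch below uses Cramér's V on the same assumed export; the 0.3 cutoff is a rough rule of thumb for flagging candidate proxies, not a legal standard:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Proxy-variable check: how strongly does a "neutral" feature associate
# with a protected attribute? Cramér's V runs from 0 (no association)
# to 1 (perfect proxy).
def cramers_v(frame: pd.DataFrame, feature: str, protected: str) -> float:
    table = pd.crosstab(frame[feature], frame[protected])
    chi2, _, _, _ = chi2_contingency(table)
    n = table.to_numpy().sum()
    r, k = table.shape
    return float(np.sqrt(chi2 / (n * (min(r, k) - 1))))

df = pd.read_csv("screening_history.csv")  # same assumed export as above
for feature in ["zip_code", "university_tier", "has_employment_gap"]:
    for protected in ["race", "gender"]:
        v = cramers_v(df, feature, protected)
        if v > 0.3:  # rough threshold; investigate anything flagged
            print(f"{feature} ~ {protected}: Cramér's V = {v:.2f} (possible proxy)")
```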

Phase 3: Documentation (Week 7-8)

Step 8: Document findings. NYC Law 144 requires specific reporting:

  • Date of audit
  • Selection rates by category
  • Impact ratios
  • Distribution date of the tool
  • Summary of results

Even if you're not in NYC, document comprehensively—you'll need this if lawsuits arise.
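A minimal sketch of capturing those required fields in a structured record, so findings are consistent year over year (field names are my shorthand, not statutory language; the example values are illustrative):

```python
from dataclasses import dataclass, field
from datetime import date

# Structure for the findings NYC Local Law 144 expects you to publish.
@dataclass
class CategoryResult:
    category: str            # e.g., "Black or African American / Female"
    selection_rate: float    # share of screened candidates who passed
    impact_ratio: float      # selection_rate / highest group's rate

@dataclass
class BiasAuditSummary:
    audit_date: date
    tool_name: str
    tool_distribution_date: date
    results: list[CategoryResult] = field(default_factory=list)
    summary: str = ""

audit = BiasAuditSummary(
    audit_date=date(2025, 10, 1),
    tool_name="Acme Resume Screener",          # illustrative
    tool_distribution_date=date(2023, 7, 5),   # illustrative
)
```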

Step 9: Publish results. NYC requires public disclosure on your careers site. Beyond compliance, transparency builds candidate trust.

Phase 4: Remediation (Ongoing)

Step 10: Fix what's broken. If audit reveals bias:

  • Retrain AI on balanced datasets
  • Remove proxy variables
  • Adjust scoring algorithms
  • Implement fairness constraints
  • Add human review for borderline cases

Step 11: Retest. After fixes, run audit again. Confirm bias is actually reduced, not just moved elsewhere.

Step 12: Schedule annual audits. NYC requires yearly audits. Best practice: audit quarterly internally, annually with third-party.

How do I get demographic data for candidates to even run an audit?

Great question—this trips up many organizations. You can't measure demographic outcomes without demographic data. Here's how to collect it properly:

Self-identification (preferred method):

  • Add optional demographic questions to applications
  • Clearly state: "This information is used only for bias auditing and will not affect hiring decisions"
  • Make it genuinely optional—candidates who decline aren't penalized
  • Keep data separate from resumes/applications during screening
  • Use standard EEOC categories for consistency

Estimation methods (when self-ID insufficient):

  • Use name-based estimation tools (e.g., probabilistic models that estimate likely ethnicity from names; see the sketch after this list)
  • Combine with other non-invasive signals if available
  • Document methodology transparently
  • Recognize limitations—estimation isn't perfect
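For context, here's a heavily simplified sketch of how name-based estimation works. The lookup table is a placeholder, not real reference data; production approaches (such as Bayesian surname-and-geography models) use published statistics and keep the output probabilistic rather than assigning labels to individuals:

```python
# Heavily simplified sketch of name-based demographic estimation.
# The table below is a placeholder, NOT real reference statistics.
SURNAME_PRIORS = {
    # surname: {category: estimated probability} -- illustrative only
    "garcia": {"hispanic_or_latino": 0.9, "white": 0.1},
    "washington": {"black_or_african_american": 0.9, "white": 0.1},
}

def estimate_categories(surname: str) -> dict[str, float]:
    """Return probability estimates, or an empty dict if the surname is unknown."""
    return SURNAME_PRIORS.get(surname.lower(), {})

# Keep the output probabilistic: aggregate expected counts for audit
# statistics instead of assigning a single label to any one candidate.
```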

What NYC Law 144 allows: The law permits test data where demographic info is "known through self-disclosure" OR "drawn from available data." Translation: estimation is allowed if self-disclosure rates are too low for statistical validity.

Privacy considerations: Store demographic data securely, separate from hiring systems. Access limited to HR analytics and audit teams. Comply with GDPR/CCPA requirements for data handling.

Practical tip: Achieve 60%+ self-identification rates by explaining WHY you're asking (bias auditing, improving fairness) and proving you're serious (share previous audit results showing improvements).

What red flags indicate my screening software is biased?

Watch for these warning signs between formal audits:

Statistical red flags:

  • Diverse candidates consistently score 10-15% lower despite similar qualifications
  • Applicant pool is 40% diverse, but screening passes only 20%
  • Impact ratios below 80% (4/5ths rule)
  • Intersectional groups (e.g., Black women) fare worse than either identity alone

Pattern red flags:

  • Candidates from HBCUs or minority-serving institutions score consistently lower
  • Non-traditional backgrounds (bootcamps, self-taught, career changers) heavily penalized
  • Employment gaps automatically downgraded regardless of context
  • Candidates with 20+ years experience (age 40+ proxy) rejected as "overqualified"
  • Resumes with ethnic names cluster at bottom of rankings

Outcome red flags:

  • Your final hires are 30%+ less diverse than your applicant pool
  • Diverse candidates who pass screening fail interviews at higher rates (suggesting the screen is advancing the wrong candidates)
  • Diverse hires perform as well as everyone else, but the AI initially scored them lower (the AI was wrong)

Explainability red flags:

  • Software can't explain why it scored candidates differently
  • Vendor refuses to share training data or methodology
  • Black-box algorithms with no transparency
  • Vendor dismisses bias concerns without evidence

Candidate feedback red flags:

  • Multiple diverse candidates report feeling unfairly rejected
  • Candidates with strong qualifications confused about rejections
  • Pattern of complaints from specific demographics

If you see 2-3 of these, audit immediately. If you see 5+, stop using the tool until it's audited and fixed.

What questions should I ask my screening software vendor?

Don't just trust vendor claims. Ask these specific questions:

About bias testing:

  • "Have you conducted third-party bias audits? Can we see results?"
  • "What were the selection rates across demographic groups in your testing?"
  • "What impact ratios did your tool achieve in bias testing?"
  • "Have you tested for intersectional bias, not just single demographics?"
  • "How often do you retest for bias as your AI learns?"

About training data:

  • "What data was your AI trained on?"
  • "How diverse was your training dataset across race, gender, age, education?"
  • "Did you remove biased data or reweight for balance?"
  • "How do you prevent your AI from learning biased patterns from customer data?"

About fairness safeguards:

  • "What fairness-aware algorithms do you use?"
  • "Can your AI explain scoring decisions?"
  • "What demographic information does your AI use? How do you prevent it from using proxies?"
  • "Do you monitor demographic outcomes in production?"

About compliance:

  • "Do you comply with NYC Local Law 144?"
  • "Will you support us in our annual bias audits?"
  • "What happens if our audit reveals bias in your tool?"
  • "Do you carry insurance for discrimination claims?"

About customization:

  • "Can we audit your tool with our own test resumes?"
  • "Can we adjust scoring criteria to reduce bias we identify?"
  • "Do you provide demographic pass-through dashboards?"
  • "Can we implement human review checkpoints?"

Red flag responses:

  • "Our AI is unbiased" (impossible claim)
  • "That's proprietary" (refusing basic transparency)
  • "We can't share audit results" (what are they hiding?)
  • "Bias testing isn't necessary for our tool" (yes it is)
  • "You'll have to trust us" (absolutely not)

Ethical vendors welcome these questions. Evasive vendors are selling risky products.

What do I do if my audit reveals significant bias?

Don't panic—but do act immediately. Here's your response protocol:

Immediate actions (Days 1-3):

1. Stop using the tool for new candidates. Don't discriminate against one more person while you fix this. Switch to manual screening temporarily if needed.

2. Notify leadership and legal. This is a legal compliance and reputational risk issue. They need to know now, not when lawsuits arrive.

3. Document everything. Your audit findings, when you discovered bias, what actions you took, and when. This documentation defends you if litigation arises.

4. Contact your vendor. Share findings. Demand explanations and fixes. If they're unresponsive or defensive, that's a sign to switch vendors.

Short-term remediation (Weeks 1-4):

5. Review past decisions. How many candidates were potentially harmed? Consider manually re-reviewing diverse candidates rejected over the past 6-12 months.

6. Implement compensating controls:

  • Add human review for all diverse candidates in borderline scoring ranges
  • Remove the most discriminatory criteria (e.g., university name, zip code)
  • Adjust scoring weights to reduce bias impact
  • Implement blind screening to remove demographic signals (see the sketch after this list)
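Blind screening can be as simple as redacting demographic signals before the resume text ever reaches the scorer. A rough sketch, with deliberately simplistic patterns you'd need to harden and QA before relying on it:

```python
import re

# Rough blind-screening sketch: strip demographic signals before scoring.
# Patterns are simplistic assumptions; real redaction needs robust parsing.
REDACTIONS = [
    (re.compile(r"(?m)^\s*.+\n"), "[NAME REDACTED]\n", 1),            # first line = name
    (re.compile(r"\b(19|20)\d{2}\b"), "[YEAR]", 0),                   # graduation years
    (re.compile(r"\b\d{5}(?:-\d{4})?\b"), "[ZIP]", 0),                # zip codes
    (re.compile(r"\b(he/him|she/her|they/them)\b", re.I), "[PRONOUNS]", 0),
]

def blind(resume_text: str) -> str:
    for pattern, replacement, count in REDACTIONS:
        resume_text = pattern.sub(replacement, resume_text, count=count)
    return resume_text
```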

7. Retrain or reconfigure. Work with vendor to:

  • Retrain AI on balanced datasets
  • Remove proxy variables
  • Implement fairness constraints
  • Add explainability features

Long-term fixes (Months 2-6):

8. Re-audit after changes. Don't assume fixes worked. Test again with same methodology. Compare results to initial audit.

9. Consider switching vendors. If your vendor can't or won't fix bias, you need a different tool. Modern AI-powered recruitment platforms are built with bias mitigation from the ground up.

10. Implement ongoing monitoring. Monthly demographic pass-through tracking. Automated alerts when metrics deviate. Quarterly internal audits. Annual third-party audits.

11. Train your team. Recruiters and hiring managers need to understand bias patterns, how to spot them, and how to escalate concerns.

12. Update processes. Combine AI screening with structured interviews, diverse panels, and competency-based evaluation to catch what AI misses.

The accountability rule: Someone senior must own bias mitigation. Not just a compliance checkbox, but actual accountability, with the authority to pause hiring if bias reemerges.

How often should I audit, and can I do it internally?

NYC Law 144 requires annual audits by independent third parties. That's the legal minimum. Here's the practical best practice:

Annual: Full third-party audit

  • Comprehensive statistical analysis
  • Comparative resume testing
  • Red team simulations
  • Legal compliance review
  • Public reporting

Why third-party? External auditors have no incentive to hide problems. They bring expertise. They provide legal defensibility. NYC requires it, and even if you're not in NYC, external audits are more credible to regulators and courts.

Quarterly: Internal monitoring

  • Review demographic pass-through rates
  • Check impact ratios
  • Spot-check screening decisions
  • Track candidate feedback
  • Monitor for new bias patterns

Why internal? Catches problems early before they compound. Faster, cheaper, more frequent. Keeps bias top-of-mind for teams.

Monthly: Dashboard review

  • Automated demographic metrics
  • Pass/fail rates by group
  • Trending analysis
  • Alerts on deviations (see the check sketched below)

Why monthly? Real-time visibility. Immediate intervention if bias spikes. Demonstrates ongoing diligence.
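Behind that dashboard sits a check like the one sketched below: recompute impact ratios on the latest month of screening data and flag any group under the 4/5ths threshold. The column names and the alert hook are assumptions; wire it into whatever your team actually monitors:

```python
import pandas as pd

THRESHOLD = 0.80  # 4/5ths rule

def monthly_check(df: pd.DataFrame, month: str) -> list[str]:
    # Assumes the same export as earlier, plus a screened_month column.
    recent = df[df["screened_month"] == month]
    rates = recent.groupby(["race", "gender"])["passed_screening"].mean()
    ratios = rates / rates.max()
    return [
        f"{group}: impact ratio {ratio:.2f} below {THRESHOLD:.0%}"
        for group, ratio in ratios.items()
        if ratio < THRESHOLD
    ]  # send these to Slack/email/ticketing in your environment

# alerts = monthly_check(pd.read_csv("screening_history.csv"), "2025-10")
```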

Ad-hoc: Trigger-based audits

Audit immediately when:

  • Vendor releases major software update
  • You change screening criteria or add roles
  • Demographic metrics suddenly shift
  • Candidate complaints spike
  • New regulations take effect
  • Before expanding to new jurisdictions

Can you audit internally? For ongoing monitoring, yes. For compliance, it depends on location. NYC requires an independent third party. Other jurisdictions may allow internal audits, but external audits are more defensible legally and more credible to stakeholders.

Cost perspective: Third-party audits cost $15,000-$50,000 depending on scope. That sounds expensive until you compare it to:

  • NYC fines: $500-$1,500 per violation (thousands of candidates = massive fines)
  • Discrimination lawsuits: $50,000-$500,000+ in settlements
  • Reputational damage: priceless (in the bad way)
  • Lost diverse talent: competitive disadvantage

Auditing isn't an expense—it's insurance against far larger costs.

What's coming next—how will audit requirements evolve?

NYC set the standard. Others are following. Here's what's ahead:

Regulatory expansion:

  • More states considering AI hiring laws (California, Colorado, New Jersey actively drafting)
  • Federal legislation proposed (the Algorithmic Accountability Act would require impact assessments)
  • EU AI Act includes high-risk classification for hiring systems
  • Expect audit requirements to become standard nationwide by 2026-2027

Stricter standards:

  • Intersectional bias testing becoming mandatory (not just race, gender separately)
  • Continuous monitoring requirements (not just annual snapshots)
  • Candidate access to AI decisions (explainability requirements)
  • Vendor liability (holding software makers accountable, not just users)

Industry certification:

  • Third-party "Bias-Free Certified" seals emerging
  • Professional standards for AI auditors developing
  • Industry benchmarks for acceptable impact ratios
  • Transparency becoming competitive differentiator

Technology improvements:

  • Real-time bias detection built into screening tools
  • Automated demographic monitoring dashboards
  • AI that can audit other AI systems
  • Explainability becoming standard, not optional

Litigation trends:

  • More collective action lawsuits like Workday case
  • EEOC increasing AI discrimination investigations
  • Class actions covering thousands of affected candidates
  • Settlements in millions, not thousands

The smart play: Get ahead of regulations now. Organizations auditing proactively will adapt easily to new requirements. Those waiting for mandates will scramble, make mistakes, and face penalties.

What's your audit action plan?

Here's your 90-day implementation roadmap:

Month 1: Foundation

  • Week 1: Inventory all screening tools and current usage
  • Week 2: Collect 6-12 months historical screening data
  • Week 3: Add demographic self-ID questions to applications
  • Week 4: Engage third-party auditor, schedule audit

Month 2: Testing

  • Week 5-6: Create and run comparative resume tests
  • Week 7: Calculate selection rates and impact ratios
  • Week 8: Conduct red team simulations and document findings

Month 3: Action

  • Week 9: Receive and review audit report
  • Week 10: Implement immediate fixes for critical biases
  • Week 11: Publish results, notify stakeholders
  • Week 12: Set up ongoing monitoring dashboards, schedule quarterly reviews

Ongoing: Sustain

  • Monthly metrics reviews
  • Quarterly internal audits
  • Annual third-party audits
  • Immediate investigation of any red flags

Remember: 83% of companies use AI screening. Most haven't audited. That makes them vulnerable—to lawsuits, fines, reputational damage, and lost talent. Don't be in that group.

Ready to audit your screening software properly? Modern recruitment platforms offer built-in bias detection, demographic monitoring, and audit-ready reporting. The tools exist. The regulations are here. The lawsuits are coming. The only question is whether you'll audit proactively or wait for problems to force your hand.

Because here's the thing: you can't fix bias you haven't measured. And you can't measure bias you're not auditing. Start auditing.

Ready to experience the power of AI-driven recruitment? Try our free AI resume screening software and see how it can transform your hiring process.

Join thousands of recruiters using the best AI hiring tool to screen candidates 10x faster with 100% accuracy.