We helped write the sequel to "Cracking the Coding Interview". Read 9 chapters for free

Why AI can’t do hiring

By Aline Lerner | Published:

The recent exciting and somewhat horrifying inflection point in AI capability, which many of us got to experience firsthand when playing with OpenAI’s ChatGPT, tipped me into finally writing this post.

I’m the founder of interviewing.io, a mock interview platform and eng hiring marketplace. Engineers use us for mock interviews, and we use the data from those interviews to surface top performers, in a much fairer and more predictive way than a resume. If you’re a top performer on interviewing.io, we fast-track you at the world’s best companies.

We’re venture backed and have raised 4 rounds of funding in the last 7 years, totaling over $15M, which means that I’ve done a lot of VC pitches. I don’t know how many exactly (and a lady should never tell), but it’s in the hundreds. Once you’ve done that many pitches, you start to hear the same feedback over and over. They range from questions about whether the eng hiring is big enough (it is) to how objections about human-on-human interviews don’t scale (if 2 humans doing a thing together didn’t scale, our species would be extinct) to polite suggestions about how we’d be a much more attractive investment if we used ML/AI to match candidates to companies.

I’ve heard the latter a LOT over the years, but despite the well-intentioned advice, I’m convinced that building an AI matcher is a fool’s errand. My argument is this: It’s not that AI doing hiring is technically impossible – ChatGPT has shown us that the ceiling on what’s possible is higher than many of us had ever imagined – but that it’s impossible because you don’t have the data. In other words, the hard part about hiring isn’t the tech. It’s having the data to make good hiring decisions in the first place.

For the purposes of this piece, I define “hiring” as being able to find candidates for specific roles and fill those roles. I am NOT referring to automating tasks like resume parsing, writing sourcing emails, or scheduling, i.e., tasks that human recruiters, sourcers, and coordinators do as part of their job. Surely those can be automated. The more interesting question is whether an AI can do the job of a recruiter better than a human.¹ In other words, can it take a list of candidates, a list of job descriptions, and then use publicly available (NOT proprietary) data to match them up successfully and fill roles? After all, the reality is that neither recruiters nor burgeoning AI recruiting startups have a proprietary data set to work with. They usually have job descriptions, LinkedIn Recruiter (the granular search functionality of which isn’t publicly accessible… but the LinkedIn profiles of candidates actually are), and whatever else they can find on the internet.

To wit, this post isn’t about how AI can’t be used for hiring if you have all the data. Rather, it’s about how you can’t get access to all the data you’d need to do hiring, thereby making the training of an AI impossible.

This post also isn't about how human are great at hiring. There's nothing special about our humanity when it comes to recruiting, and humans actually suck at hiring for the same reasons as AI — we don't have the data either.

A few caveats: the case for Microsoft and the question of bias

At this point, you’re probably thinking, “Well, surely Microsoft can do this, given that they own both LinkedIn and GitHub.” In this post, you’ll see why LinkedIn and GitHub are not enough. Perhaps if Microsoft chose to buy a bunch of applicant tracking systems (ATSs) in order to get access to interview performance data, coupled with data from LinkedIn and GitHub, they’d have a fighting chance, but honestly that’s probably not enough either.

Moreover, the tenuous Microsoft edge aside, the reality is that most of us do NOT have access to the kind of training data we’d need, but we still see startup after startup claiming to do AI hiring in their marketing materials.

Finally, before we get into why AI can’t do hiring, I want to call out the important question of bias that results from training AI on hiring data where decisions were previously made by humans. To keep this (already long) post on task, I will not touch on the subject of the bias. To be sure, it’s a real problem, and there’s already a lot of good writing on the subject.

That caveat aside, if we’re trying to build a solution that takes candidates and jobs as inputs and produces a set of matches as output, let’s start by considering what that matcher does and how it’s trained.

How do we train an AI matcher?

Let’s pretend for a moment that we have built the platonic ideal of an AI matcher. It takes 2 inputs: a list of candidates and a list of companies, and a sorted list of company/role matches come out, like so:

To train this matcher, we need 3 distinct pieces of data:

Publicly available job descriptions from a bunch of tech companies
Publicly available data about engineers, i.e., LinkedIn profiles, GitHub profiles & contributions, and engineers’ social graph across a bunch of different platforms
A list of successful company/candidate matches, taken from the public domain, e.g., from scraping LinkedIn to see where people worked/for how long, and cross-referencing that with (1) and (2).

You may notice that in all 3 data requirements above, I called out that they’re publicly available. That might seem odd at first — after all, if you’re starting a company that’s building this matcher, your secret sauce might be your proprietary data about candidates or companies or both, and you might first run a different business model, completely divorced from AI, to collect this data, at which point, boom, you flip a switch, and all of a sudden you’re an AI company.

That’s a good strategy, but it’s actually really hard to acquire detailed, proprietary data about candidates, companies, or how well people do at companies once they’re hired, let alone all 3 at once. Most startups that try to build an AI matcher don’t start with a bunch of proprietary data. Rather, they start with the public domain. The thesis of this piece is that getting the data is the hard part, not the AI, so to reason through it, let’s assume that we have the AI already but that the only data we have is publicly available.

So now that we have all this training data, let’s get to work. We’ll go through the set of successful matches and then find more data about those candidates and those companies and see which traits carry the most signal for a good match.

We now have a trained and working matcher. So far so good. But wait, not so fast!

What is hiring, really?

Let’s switch gears and forget about our matcher for a moment. Broadly speaking, regardless of how we get there, what needs to be true for someone to be a good fit for a job? There are three things:

Good: They’re good enough to do the job
Looking: They’re open to taking a new job
Interested: They’re interested in the job/company

Let’s succinctly call those “good, looking, and interested”. These three criteria are necessary to make a hire.

The first two items are largely independent of the company. The third is about candidate/company fit, and we’ll come back to it when we talk about matching. Before we do that, though, can we actually deduce which engineers are good and looking?

I would argue that we can’t. No matter how the matcher ended up getting trained or what patterns or artifacts it detected and assigned value to, the data to tell whether someone is a good engineer (even if the definition is elastic, depending on a given company’s “bar”) simply does not exist in the public domain.** An AI is very good at finding patterns in existing data. It is not good at magicking data out of thin air. That means that before we even get to the question of matching, we’re dead in the water.** Let me try to convince you.

“Good”

Generally, you have 3 pieces of public data available to you for a given engineer:

Their public-facing LinkedIn
Their public-facing GitHub
Their public-facing social graph

Surely you can tell if someone is a good engineer from some combo of these 3?

Let’s look at each of the 3 data sources above, starting with LinkedIn. What data is available on engineers’ public-facing LinkedIns? Given that a LinkedIn profile is a glorified resume, it’s usually these 3 things:

Where they’ve previously worked
Where they went to school
Any certifications/endorsements/skills that they have

I run a hiring marketplace, and having any kind of edge in predicting which of our users are good is material to our business, so we’ve spent a good amount of time and effort trying to tie these attributes to how good an engineer is. As it turns out, an engineer’s employment history carries some signal, school carries very little to none, and LinkedIn certifications and endorsements carry a negative signal.²

Knowing which programming languages or frameworks an engineer knows can be useful, but most eng roles are either language/framework agnostic OR the language/framework is not the most significant bit when determining whether an engineer is a good fit or not — knowing the language is usually a nice to have, but it won’t get you hired if you’re not a good coder, first and foremost.

Often, in the absence of being able to search for whether an engineer is good, recruiters will search for programming languages as a proxy for fit, but that’s all it is, a weak proxy.

Basically, if you’re relying on a combo of school, work history, and specific skills, you’re doing exactly the same work that armies of recruiters have been doing for decades, with very limited success. It’s really easy to search LinkedIn for past employers, schools, and skills/endorsements. They’ve built a whole business on it. However, humans are terrible at predicting engineer quality from resumes — only about as good as a coinflip. And it’s not because humans are bad and that an AI would do better. It’s because there’s minimal signal in a resume in the first place, and try hard as you might, extracting signal in this case is like squeezing water from a stone. Having a robot hand will not save you.

AI or not, we’re in a hard market. Keep your skills sharp. Sign up for anonymous mock interviews with engineers from top companies.

See available times

GitHub

GitHub is an interesting one. Surely if you have access to a bunch of code someone has written, you can suss out if they’re a good engineer. The GitHub approach is also appealing because it’s much more meritocratic than a resume — your good code can stand on its own, regardless of who you are or where you come from. Wouldn’t it be great if GitHub could help you surface the odd diamond in the rough or an upstart with no job experience?

This is of course, the ethical question of whether GitHub should be used for hiring in the first place and whether it’s fair to create a bias toward hobbyist engineers and/or open-source contributors.³ For the purposes of this piece, I will put aside that question and focus just on whether GitHub can be used for hiring.

The short answer is that it can’t because most engineers don’t have public commits. Senior engineers at large tech companies don’t work on open-source projects for the most part. This is why programmers on Reddit laugh at the idea of screening out candidates with unused GitHub accounts. Bjarne Stroustrup, the inventor of C++, would look unimpressive to an algorithm obsessed with GitHub activity.

This same message is clear from data on GitHub’s users. The website reported 100 million active developers in early 2023, but data on all commits in 2022 shows that only about 11 million users had any public code commits and only 8 million had more than two commits. That means that only 8-10% of engineers actually have publicly available GitHub data that our AI could use, in the absolute best case, and this counts millions of repositories with names like “hello world” and “test.”

Ben Frederickson did some of his digging about the utility of GitHub in hiring and published a stellar, highly detailed report in 2018. According to Frederickson, only 1.4% of GitHub users pushed more than 100 times, and only 0.15% of GitHub users pushed more than 500 times. Frederickson’s findings from 2018 roughly corroborate ours from late 2022, and in both cases, there is a clear power law – most of the commits are being done by a tiny fraction of GitHub’s users.

A paper called “The Promises and Perils of Mining GitHub”, published in 2014, looked at GitHub activity as well, through the lens of projects rather than users. They found a similar power-law relationship – the most active 2.5% of projects account for the same number of commits as the remaining 97.5% projects. Moreover, this paper found that about 37% of projects on GitHub were not being used for software development, and the rest were used for other purposes (e.g., storage).

Given these limitations, the reality is that the portion of GitHub accounts that could actually be useful for hiring is likely under 1%.

Social Graph

Finally, we have the social graph. The hypothesis here is that great engineers follow other great engineers on platforms like GitHub and Twitter, so if we can identify a set of great engineers somehow, and dig deeply enough through the tangled web of whom they follow and who follows them, we’ll be able to create a reliable talent map.

Let’s assume for a moment that we can seed this approach with enough great engineers (e.g., by scraping the GitHub profiles that do have enough code to extract signal). What happens next?

To get a feel for the “following” behavior among software engineers and whether they tend to follow the best people they work with, we surveyed our user base, like so:

Based on almost 1000 responses from our engineers (average years of experience = 8, median = 7), the social graph approach doesn’t hold water. First, engineers that we surveyed only rarely followed their most impressive colleagues on GitHub or Twitter. The average engineer reported following just one of their top five coworkers in these sites. The social graph was just not that connected to people’s perceptions of talent.

What about LinkedIn? Although many of our users did indeed follow the best engineers they’ve ever worked with, they also followed everyone else: the majority of engineers we surveyed reported that their connections had no rhyme or reason, or they just connected with anyone who tried to connect with them.

These muddle the graph-based approaches. Twitter and GitHub follows are still too uncommon to be a reliable signal. And LinkedIn connections capture a haphazard mix of networking approaches — the most dominant of which is either random or just driven by workplace, which any recruiter would have known in the first place.

“Looking”

Let’s assume for a moment we can figure out who the good engineers are (or rate them on some kind of scale, at least). Next, we have to figure out which of these engineers are on the market right now.

This is arguably an easier problem to solve than whether an engineer is good because publicly available cues do exist. That said, a number of startups have tried to tackle this problem (most notably Entelo, with their Sonar product), though so far none have been particularly successful. The kinds of inputs that typically go into trying to figure out if an engineer is active are split between candidate-level attributes (e.g., how recently they updated their LinkedIn) and company-level attributes (e.g., how long since the company’s last round of funding). Here’s what that list could look like:

Candidate-level attributes:

When did they last update their LinkedIn/have they recently started being more active on LinkedIn?
How long have they been in their current role? And at their current company?
Have they recently started being more active on social media (e.g., tweeting about engineering topics)?
Have they recently started a blog?
Have they recently started contributing to open source?

Company-level attributes:

How long has it been since the company last raised money (if not public)?
How has the stock price been doing (if public)?
Have other people been leaving the company, especially management?

Some of these attributes are easier to pull than others (e.g., to track LinkedIn updates, one has to be logged in and has to repeatedly cache candidate activity, which likely violates LinkedIn’s terms of service), but I imagine that there’s enough publicly accessible data to make some guesses about who’s moving. Of course, these guesses will be fairly primitive for candidates who don’t do stuff publicly and loudly — going just off of stock price and/or fundraising history gives you a very crude first pass, but it’s not enough.

When asked their favorite social media platforms, half of engineers in a StackOverflow survey reported Reddit or YouTube — which it’s hard to imagine could ever be leveraged for predictive algorithms — or not using social media at all.

As such, even if this data is easier to get than candidate quality data, figuring out who’s looking is still a data problem and not an AI tech problem.

Matching: Figuring out what engineers want and whether companies have it

Let’s say that we’re somehow able to surmise from public-facing candidate data whether and engineer is both “good” and “looking”. Now we need to figure out whether they’re actually going to be interested in a given company.

To do that, we need to have a list of company attributes that engineers could care about, and then we need to figure out 1) how each company stacks up against this list and 2) which attributes a given engineer cares about. I’ve been a recruiter for about a decade, and below is a decently representative list (in no particular order):

Compensation
Company mission/whether the company is mission-driven
Vertical
Size of whole team and the eng team
Tech stack
Young vs. established
Prestige of the brand/social proof
What problems are being solved/what’s coming up on the roadmap
Chemistry with manager and with the immediate team
The company’s culture and values

Now that we have this list, how do we figure out how each company stacks up? Without a proprietary data set, the main resource you have is a company’s job descriptions. How much of this information can you pull from those descriptions?

Figuring out what companies have to offer

Compensation is increasingly easier to get, especially given recent legislation in some states that makes it mandatory for employers to include salary ranges in their job descriptions. Of course, these ranges tend to be wide, and many aren’t super useful (see the examples below), but it’s something. The screenshot below comes from comprehensive.io, a site that aggregates comp ranges from job descriptions in NY and California, where companies are legally required to disclose them. As you can see, the ranges are quite wide.

Below is an example for a specific company and role: Dropbox’s open Mobile Software Engineer role in the US. As you can see, the ranges are pretty wide (a 66K spread for SF, NYC, and Seattle for instance). In my mind, like many of these ranges, all this tells you is “you’re going to get paid market for the location that you’re in”. Of course, if a company is paying below market, that’s something you need to know, but that’s the exception rather than the rule.

Some attributes like vertical, tech stack, total size, and brand prestige, are more consistent than compensation and are possible to figure out from publicly available data.

Whether the company is mission-driven can sometimes be deduced from the vertical, the company’s B-corp status, and the company’s job descriptions. However, given every company’s penchant for sounding mission-driven, even when they do something as dryly mercenary as ad-serving infrastructure, this may be a bit tricky. But I’m sure the tech is there to figure this one out.

Properties like eng team size, upcoming projects and roadmap, and the company’s culture/values are much harder. Sussing these out is not a technological problem but rather a data one, just like determining whether engineers are good or not. Don’t believe me? Read any job description. How do you begin to figure out from this wall of mush what it’s actually like to work at a company? Maybe some great job descriptions exist out there, and maybe you can get somewhere by scraping sites like Glassdoor, but given how low-signal both tend to be, AI will likely get just as hamstrung as the humans trying to parse these documents.

If you don’t buy that, check out keyvalues.io and look for literally any company. You’ll notice that every company seems to have pulled randomly from the same-grab bag of 20 or so lofty-sounding values that tell you nothing about what actually happens at that company day to day. Instead, you find yourself smack in the middle of a vague virtue signaling arms race.

So, some of these attributes you can figure out from the public domain, and some you can’t. In a nutshell, the data available to you on both sides looks like this:

Now we have to tackle the second part of the matching problem: how do we figure out what each engineer on our list is looking for? For instance, which engineers in our database care about compensation, and what are their salary requirements? And what types of problems are they most interested in? And so on…

The reality is that there isn’t a good way to intuit most of these things from publicly available data. If you want to know what engineers care about, you have to ask them. And even when you ask them, they will very likely, albeit through no fault of their own, be lying.

Figuring out what engineers value and what they’re looking for

Because, to the best of my knowledge, there isn’t a dataset out there that compares and contrasts what engineers said they were looking for versus what jobs they ended up taking, I’ll fall back to an anecdote from earlier in my recruiting career. TL;DR Dr. House was right; everybody lies.

Many years ago, I was interim head of talent at Udacity. Many of my candidates told me that it was their dream to work in ed-tech, but one in particular stood out. In addition to extolling his passion about education, he told me that one of his deal-breakers was working in advertising and that he'd never do it. He did great in our first-round technical screen, and I set him up with an onsite interview. Then, while he was in town, he interviewed at an advertising startup where one of his friends was working. That's the place he ended up choosing… because he really hit it off with the team. Though this particular example was the most stark (the one-letter difference between “ad-tech” and “ed-tech” belies the massive gulf between those verticals), instances like this, where a candidate claimed to strongly want one thing but then ended up choosing something completely different after meeting the team, are the rule rather than the exception.

The truth is, people will tell you all manner of lies about where they want to work, what they want to work on, and what's important to them. But, of course, they're not lying deliberately. It likely means you're not asking the right questions, but sometimes knowing what to ask is really hard. For many of the questions you can think of, people will have all sorts of rehearsed answers about what they want to do, but those answers are framed to the specific audience and may not reflect reality at all. Or, a lot of the time, people simply don't know what they want until they see it.

It’s hard enough sussing this stuff out when you’re talking to candidates 1:1. Imagine trying to gather this kind of nuanced information from what they say on social media or in their public blogs.

Perhaps the most important thing I've learned is that, at the end of the day, one of the biggest parts of a recruiter's job is to get the right two people in a room together. Regardless of industry, or domain, or stack, or money (within reason of course), chemistry is king. Get the right two people to have the right conversation, and everything else goes out the window. Everybody lies. It’s not malicious. It’s just that chemistry is the thing that matters most, and all the rest of the attributes above are poor proxies for the magic that sometimes happens when two smart people have a good conversation.

How do you predict chemistry between people? Can an AI do it? Possibly, if that AI has access to a ton of data about candidates and companies, i.e., everything we’ve discussed in this post thus far… AND past candidate/company interactions and their outcomes.

Even if you can’t get all the candidate and company data you’d need, you CAN get a history of candidate/company interactions and their outcomes from an Applicant Tracking System (ATS). But ATS data is not public. It’s the opposite — for ATSs, their data is their moat, which is what drives retention, and ATS switching costs are painful and often prohibitive.

In the absence of rich candidate and company data and the interactions between them, an AI predicting chemistry is impossible. Hell, humans can’t do it either.

But what if you do have the data?

If you have proprietary data, then you don’t need an AI. A simple non-AI program (e.g., a regression) or a human can do the job well enough. In fact, Arvind Narayan from Princeton gave an excellent talk called “How to recognize AI snake oil”, whose crux is that, for complex questions where you need to predict social outcomes (e.g., recidivism, job performance), no matter how much data you have, because “AI is not substantially better than manual scoring using just a few features”.

Arguably, if you do have the data, you could still build out AI hiring to increase efficiency, but remember that it’s your possession of proprietary data that made an AI approach viable.

Conclusion

So what does this all mean for the future of recruiting? As I said in the beginning, AI is really well-suited to automating a bunch of recruiting tasks that humans do now. For instance, an AI can take the pain out of stuff like this:

Interview scheduling
Composing first-draft sourcing emails (though you’d need a human to truly make them sing)
Enriching candidate profiles with “first-order” data (years of experience, location, figuring out what programming languages they’ve used in some cases, etc.)
Tracking candidate progress through a funnel
Some amount of assessment⁴
Creating beautiful dashboards that track key recruiting metrics (time to hire, cost per hire, etc.)

As AI gets more and more sophisticated, the list above will get longer and longer, and given that most recruiters aren’t particularly good at their jobs, over time, AI will take over more and more, and there will be progressively less for human recruiters to do.

However, the stuff above isn’t what makes recruiting hard; these are the trappings of recruiting, but not the essence. The hard thing about recruiting is figuring out who’s good and who’s looking right now, and bulldozing the way for those candidates to have as many conversations with companies as they have appetite for, in order to see if they have chemistry with that company and potentially their future team.

Until we have access to all the data that reliably predicts whether someone is a good engineer, whether they’re looking right now, and what a company offers, and whether they’re interested in that thing, having an AI will not be enough.

At the end of the day, you can’t use AI for hiring if you don’t have the data. And if you have the data, then you don’t strictly need AI.

And finally, because we don't have the data, humans will also continue to be bad at hiring. The difference is that good human recruiters can make some meaningful warm intros, let 2 engineers get in a room together to see if there's chemistry, and get the hell out of the way.

Huge thank you to Maxim Massenkoff, who did the data analysis for this post. Also thank you to everyone who read drafts of this behemoth and gave their feedback.

Footnotes

If I were you, I’d be justifiably skeptical at this point. Wow, a recruiter is writing about how AI can’t do recruiting. Classic Luddite trope, right? Let’s burn down the shoe factory because it can never compete with the artisanal shoes we make in our homes. In this case, though, you’d be wrong. I walked away from a very lucrative recruiting agency that I built to start interviewing.io, precisely because I wanted to be the guy who owned the shoe factory, not the guy setting it on fire out of spite. Recruiting needs to change, it needs disintermediation, and it needs more data. It’s the only way hiring will ever become efficient and fair. I just don’t think AI is going to be that change. If I’m wrong, I’ll be the first in line to pivot interviewing.io to an AI-first solution. It’s also worth calling out that in this piece, I focus specifically on hiring engineers, as my whole career and all of interviewing.io is dedicated exclusively to technical recruiting, but I expect that my reasoning holds for many other verticals. ↩
In all my years of trying to figure this out, the thing that carried the most signal is how many typos engineers had on their resume, which I think highlights the absurdity of this whole undertaking. See this post and this post for previous data/discussion about the relative importance of employment history and pedigree in hiring. ↩
See this piece by Ashe Dryden for some good discussion about the ethics of open source labor. ↩
You might be wondering at this point why I didn’t list assessments here. Arguably, AI will be able to do some amount of assessment, probably somewhere between a HackerRank/Codility and a real human interviewer. I wrote many years ago about how recruiting engineers is fundamentally a sourcing rather than a vetting problem, so I’m very skeptical that AI-based assessments are meaningfully going to change the game – in order for assessments to be effective, engineers have to be willing to do them, which means there has to be value for both sides. This is why senior engineers typically balk at asynchronous coding tests, and the biggest attrition rates are usually at that part of the funnel. The future of technical assessments, in the age of generative AI is a noble and difficult topic, though, and it’s something we’ll tackle in a future post. ↩