interviewing.io is a platform where people can practice technical interviewing anonymously, and if things go well, get jobs at top companies in the process. We started it because resumes suck and because we believe that anyone, regardless of how they look on paper, should have the opportunity to prove their mettle.
At the end of 2015, we published a post about how people are terrible at gauging their own interview performance. At the time, we just had a few hundred interviews to draw on, so as you can imagine, we were quite eager to rerun the numbers with the advent of more data. After drawing on roughly one thousand interviews, we were surprised to find that the numbers have really held up, and that people continue to be terrible at gauging their own interview performance.
When an interviewer and an interviewee match on interviewing.io, they meet in a collaborative coding environment with voice, text chat, and a whiteboard and jump right into a technical question (feel free to watch this process in action on our recordings page). After each interview, people leave one another feedback, and each party can see what the other person said about them once they both submit their reviews.
If you’re curious, you can see what the feedback forms look like below — in addition to one direct yes/no question, we also ask about a few different aspects of interview performance using a 1-4 scale. We also ask interviewees some extra questions that we don’t share with their interviewers, and one of those questions is about how well they think they did. For context, a technical score of 3 or above seems to be the rough cut-off for hirability.
Below are two heatmaps of perceived vs. actual performance per interview (for interviews where we had both pieces of data). In each heatmap, the darker areas represent higher interview concentration. For instance, the darkest square represents interviews where both perceived and actual performance was rated as a 3. You can hover over each square to see the exact interview count (denoted by “z”).
The first heatmap is our old data:
And the second heatmap is our data as of August 2016:
As you can see, even with the advent of a lot more interviews, the heatmaps look remarkably similar. The R-squared for a linear regression on the first data set is 0.24. And for the more recent data set, it’s dropped to 0.18. In both cases, even though some small positive relationship between actual and perceived performance does exist, it is not a strong, predictable correspondence.
You can also see there’s a non-trivial amount of impostor syndrome going on in the graph above, which probably comes as no surprise to anyone who’s been an engineer. Take a look at the graph below to see what I mean.
The x-axis is the difference between actual and perceived performance, i.e. actual minus perceived. In other words, a negative value means that you overestimated your performance, and a positive one means that you underestimated it. Therefore, every bar above 0 is impostor syndrome country, and every bar below zero belongs to its foulsome, overconfident cousin, the Dunning-Kruger effect.1
On interviewing.io (though I wouldn’t be surprised if this finding extrapolated to the qualified engineering population at large), impostor syndrome plagues interviewees roughly twice as often as Dunning-Kruger. Which, I guess, is better than the alternative.
With all this data, I couldn’t resist digging into interviews where interviewees gave themselves 1’s and 2’s but where interviewers gave them 4’s to try to figure out if there were any common threads. And, indeed, a few trends emerged. The interviews that tended to yield the most interviewee impostor syndrome were ones where question complexity was layered. In other words, the interviewer would start with a fairly simple question and then, when the interviewee completed it successfully, they would change things up to make it harder. Lather, rinse, repeat. In some cases, an interviewer could get through up to 4 layered tiers in about an hour. Inevitably, even a good interviewee will hit a wall eventually, even if the place where it happens is way further out than the boundary for most people who attempt the same question.
Another trend I observed had to do with interviewees beating themselves up for issues that mattered a lot to them but fundamentally didn’t matter much to their interviewer: off-by-one errors, small syntax errors that made it impossible to compile their code (even though everything was semantically correct), getting big-O wrong the first time and then correcting themselves, and so on.
Interestingly enough, how far off people were in gauging their own performance was independent of how highly rated (overall) their interviewer was or how strict their interviewer was.
With that in mind, if I learned anything from watching these interviews, it was this. Interviewing is a flawed, human process. Both sides want to do a good job, but sometimes the things that matter to each side are vastly different. And sometimes the standards that both sides hold themselves to are vastly different as well.
Techniques like layered questions are important to sussing out just how good a potential candidate is and can make for a really engaging positive experience, so removing them isn’t a good solution. And there probably isn’t that much you can do directly to stop an engineer from beating themselves up over a small syntax error (especially if it’s one the interviewer didn’t care about). However, all is not lost!
As you recall, during the feedback step that happens after each interview, we ask interviewees if they’d want to work with their interviewer. As it turns out, there’s a very statistically significant relationship between whether people think they did well and whether they’d want to work with the interviewer. This means that when people think they did poorly, they may be a lot less likely to want to work with you. And by extension, it means that in every interview cycle, some portion of interviewees are losing interest in joining your company just because they didn’t think they did well, despite the fact that they actually did.
How can one mitigate these losses? Give positive, actionable feedback immediately (or as soon as possible)! This way people don’t have time to go through the self-flagellation gauntlet that happens after a perceived poor performance, followed by the inevitable rationalization that they totally didn’t want to work there anyway.
I’m always terrified of misspelling “Dunning-Kruger” and not double-checking it because of overconfidence in my own spelling abilities. ↩
The boundary of a binary tree is the concatenation of the root, the left boundary, the leaves ordered from left-to-right, and the reverse order of the right boundary.
Given a 32-bit signed integer, reverse digits of the integer.
Given a string s, find the length of the longest substring without repeating characters.
Interview prep and job hunting are chaos and pain. We can help. Really.