We helped write the sequel to "Cracking the Coding Interview". Read 9 chapters for free

interviewing.io

Browse all interview replays

Life is chaos and pain. Interview prep doesn't have to be.

Video Preview

Get instant access to anonymous mock interviews, salary negotiation, and the world's largest library of interview replays.

MEDIUM

DATA STRUCTURES AND ALGORITHMS

How to Find the Missing Number in an Array (With Solutions in Python & JavaScript)

SEARCHMAPSARRAYS

Written By

Tom Wagner

Githire B. Wahome

An Overview of the Find Missing Number in an Array Problem

The Find Missing Number in an Array problem involves identifying the missing number between two almost identical arrays. While the task is straightforward, the challenge in this problem is appropriately communicating the time and space complexity tradeoffs offered by data structures with constant time or linear time lookups.

Examples of the Find Missing Number in an Array Interview Question

Given an unsorted array of unique integers (size n + 1) and a first array identical to the second array, but missing one integer (size n), find and output the missing integer.

Input: list1 = [1, 2, 3, 4] list2 = [1, 2, 3, 4, 5] Output: 5

Input: [1], [] Output: 1

Constraints

-inf ≤ n ≤ inf
All entries are unique
length(list2) - length(list1) == 1

How to Find the Missing Number in an Array: 4 Approaches

Let’s start by reframing this problem as a real world problem. Imagine you're a warehouse supervisor and you have to conduct a roll call every morning.

To keep the problem statement analogous to our real-world example, let’s refer to the longer list as the register, which enumerates all the workers who are supposed to show up to the warehouse on a given day. Let’s refer to the smaller list as the actual attendance, denoting who actually showed up.

Approach 1: Brute Force

In a warehouse, the supervisor typically has a list of all the workers for a given shift. As the workers arrive, they're crossed out on the list, leaving the absentees.

To conduct our roll call, we run through the register (or longer list), verifying that we have the integers in the attendance list (the smaller list). If any of the workers are missing, we would have found our absentee, so we return it, or break out of the loop and return the integer at which we stopped iterating. The code for this in Python would be as follows:

list1 = [1,2,3,4]
list2 = [1,2,3,4,5]
 
def find_missing_number(list1, list2):
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
find_missing_number(list1, list2) # returns 5
list1 = [1,2,3,4]
list2 = [1,2,3,4,5]
 
def find_missing_number(list1, list2):
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
find_missing_number(list1, list2) # returns 5

This approach works, but isn't very efficient, as we have to scan the entire attendance list every time we look up a number. Scanning is an O(n) (linear) operation; thus, we have an O(n²) solution (Our example models the worst case as we have to scan both lists exhaustively). The space complexity is O(1) as we only need to keep tracking one integer at a time, the index.

Time/Space Complexity

Time Complexity: O(n²) Space Complexity: O(1)

Can we do better? Of course! Let's think a bit more. What if we didn't have to scan the register each time we wanted to look up a worker? Hash tables / dictionaries and sets to the rescue.

Approach 2: Leveraging `O(1)` Lookups

The first approach works, but scanning the attendance list every time isn't optimal. Can we leverage any data structures that optimize for lookups? Both hash maps and sets allow for O(1) lookups. We can optimize for time by converting the smaller list into a set, then searching for every item in the longer list within the set. The tradeoff with this approach is increased efficiency from a time-complexity perspective at the expense of additional memory / space complexity, as we need an auxiliary hash table or set to do the O(1) lookups.

def find_missing_number_sets(list1, list2):
   list1 = set(list1) # ’Setify’ the list
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
print(find_missing_number_sets(list1, list2)) # returns 5
 
def find_missing_number_dict(list1, list2):
   list1 = {elem:True for elem in list1} #Turn into a dict
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
print(find_missing_number_dict(list1, list2)) # returns 5
def find_missing_number_sets(list1, list2):
   list1 = set(list1) # ’Setify’ the list
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
print(find_missing_number_sets(list1, list2)) # returns 5
 
def find_missing_number_dict(list1, list2):
   list1 = {elem:True for elem in list1} #Turn into a dict
   for integer in list2: # Conduct our roll call against the list2
       if integer not in list1:
           return integer # Break when an absentee is found
 
print(find_missing_number_dict(list1, list2)) # returns 5

Time/Space Complexity

Time Complexity: O(n)
Space Complexity: O(n)

Due to the constant time lookup, we have now achieved O(n) lookup time but now use O(n) space. It seems like the classic quip, ‘throw a hash-map at it’, actually holds its weight.

Is this the best we can do? For time complexity, sadly, yes. But can we optimize for space?

Approach 3: Supervisor Keeping Track of Orders Filled

Consider the list [1, 2, 3, 4, 5], which sums to 15. Now consider another list [1, 2, 3, 5], which sums to 11. Notice anything about these two numbers? The difference between 11 and 15 is 4, which is our answer.

def find_missing_number_simple_sum(list1, list2):
   expected_sum = 0
 
   for integer in list2: expected_sum += integer # Calculate expected total
   for integer in list1: expected_sum -= integer # Subtract filled orders
 
   return expected_sum # Return pending orders
 
print(find_missing_number_simple_sum(list1, list2)) 
 
def find_missing_number_sum_simultaneous(list1, list2):
   expected_sum = 0
 
   for idx in range(len(list1)):
       expected_sum += list1[idx]
       expected_sum -= list2[idx]
  
   # Include last entry of the list2
   return expected_sum + list2[-1] # Return pending orders
 
print(find_missing_number_sum_simultaneous(list1, list2))
def find_missing_number_simple_sum(list1, list2):
   expected_sum = 0
 
   for integer in list2: expected_sum += integer # Calculate expected total
   for integer in list1: expected_sum -= integer # Subtract filled orders
 
   return expected_sum # Return pending orders
 
print(find_missing_number_simple_sum(list1, list2)) 
 
def find_missing_number_sum_simultaneous(list1, list2):
   expected_sum = 0
 
   for idx in range(len(list1)):
       expected_sum += list1[idx]
       expected_sum -= list2[idx]
  
   # Include last entry of the list2
   return expected_sum + list2[-1] # Return pending orders
 
print(find_missing_number_sum_simultaneous(list1, list2))

Time/Space Complexity

Time Complexity: O(n)
Space Complexity: O(1) for most cases, but technically O(n). On the surface this seems like an O(1) space complexity solution, however if the sum of the numbers exceeds the 32-bit or 64-bit integer memory limit the space complexity will actually be O(n). This is why we see the Big Int / Big Integer construct in certain languages.

Approach 4[Advanced Technique]: Bit Manipulation

Please note that this is an advanced technique that is not expected to be known for most programming interviews, so if you are studying for an entry-level, mid-level or Senior position this content is likely overkill relative to what you need to know.

All integers can be represented as 0s and 1s. With just 2 bits, 0 is equivalent to 00, 1 is equal to 1, and 3 is 11. There are only 10 types of people in the world, those who understand binary and those who do not. Ensure you do before proceeding to the next section :)

XOR

XOR stands for 'exclusive OR'. Simply put, it is an OR operation without the possibility of equality. True OR True would be True, but True XOR True would be False. How does this help us?

Integers typically are represented using 32 bits, at least in Python.

The number 1 as base 2 is:

"00000000000000000000000000000001"

The number 10 would be equivalent to the following:

"00000000000000000000000000001010"

So long as an entry is an integer, we only need 32 bits to keep track of it.

If we start with a clean slate, i.e., the value 0, we would have the following:

"00000000000000000000000000000000"

Now, what if we did a bitwise XOR between 0 and 1?

“00000000000000000000000000000000” XOR “00000000000000000000000000000001”

Each positionally equivalent bit will be compared, and we will end up with a 1:

“00000000000000000000000000000001”

Converting this back to base 10 returns 1.

Now let us ask ourselves, what if we did an XOR of 1 and 1?

We would return 0 for that first bit, thus reverting to a clean slate again.

“00000000000000000000000000000001” XOR “00000000000000000000000000000001”

= “00000000000000000000000000000000”

You can play around with bitwise operations here: https://unsuitable001.github.io/BitViz/

Now how is this useful for our problem?

For each integer both in the attendance list and the registry, so long as an XOR operation is performed, equal integers will cancel out their corresponding bit flips and restore the clean slate of the result, all but 1. The absentee integer in the registry will perform bit flips that will not be canceled out. As a result, once we are done with the XOR operations over all entries, the only 1 bits standing will represent the absent number once we are done.

In Python and JavaScript, ^ is used to perform the bitwise XOR operation. As such, you can do 1 ^ 1, which should return 0.

print(1 ^ 1) # 0

1print(1 ^ 1) # 0

Building on this logic, we can start with a 0 as our slate and XOR all integers in both lists against the slate. Upon iterating through all elements, we will be left with our absent integer:

def find_missing_number_simple_bitwise(list1, list2):
   slate = 0
 
   for integer in list2: slate ^= integer # XOR integers against slate
   for integer in list1: slate ^= integer
 
   return slate # Whatever we are left with is the missing int
 
print(find_missing_number_simple_bitwise(list1, list2))
 
def find_missing_number_bitwise_simultaneous(list1, list2):
   slate = 0
 
   for idx in range(len(list1)):
       slate ^= list1[idx] # XOR against slate simultaneously
       slate ^= list2[idx]
  
   # Include last entry of list2
   return slate ^ list2[-1] # Return pending integers
print(find_missing_number_bitwise_simultaneous(list1, list2))
def find_missing_number_simple_bitwise(list1, list2):
   slate = 0
 
   for integer in list2: slate ^= integer # XOR integers against slate
   for integer in list1: slate ^= integer
 
   return slate # Whatever we are left with is the missing int
 
print(find_missing_number_simple_bitwise(list1, list2))
 
def find_missing_number_bitwise_simultaneous(list1, list2):
   slate = 0
 
   for idx in range(len(list1)):
       slate ^= list1[idx] # XOR against slate simultaneously
       slate ^= list2[idx]
  
   # Include last entry of list2
   return slate ^ list2[-1] # Return pending integers
print(find_missing_number_bitwise_simultaneous(list1, list2))

And there we have it, a clean approach that uses a constant number of bits and thus is precisely constant space.

Time/Space Complexity

Time Complexity: O(1)
Space Complexity: O(1), as we will only ever need 32 bits for so long as all our entries are integers.

Practice the Find the Missing Number in an Array Problem With Our AI Interviewer

Start AI Interview

Watch These Related Mock Interviews

Missing item list difference

Airbnb Interviewer

Missing item list difference

The Legendary Artichoke, an Airbnb engineer, interviewed in Python

Watch interview

About interviewing.io

interviewing.io is a mock interview practice platform. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.

Sign up for a mock interview

Interview Problems

We know exactly what to do and say to get the company, title, and salary you want.

Interview prep and job hunting are chaos and pain. We can help. Really.

App screenshot