MEDIUM
DATA STRUCTURES AND ALGORITHMS

# How to solve the Find Missing Number In Array (Two Arrays) Problem

ARRAYSSEARCHMAPS
Written By
Tom Wagner
Githire B. Wahome

## Introduction Find Missing Number In Array (Two Arrays)

The Find Missing Number In Array problem involves finding the missing number between two almost identical arrays. While the task is straightforward, the challenge with this problem is in appropriately communicating the time and space complexity tradeoffs offered by data structures with constant time or linear time lookups.

## Problem Find Missing Number In Array (Two Arrays)

Given an unsorted array of unique integers (size n + 1) and a first array identical to the second array but missing one integer (size n), find and output the missing integer.

### Example Inputs and Outputs

#### Examples

Input: list1 = [1, 2, 3, 4] list2 = [1, 2, 3, 4, 5] Output: 5

Input: [1], [] Output: 1

Constraints

• `-inf``n``inf`
• All entries are unique
• length(list2) - length(list1) == 1

## Solution Find Missing Number In Array (Two Arrays)

Watch Someone Solve Find Missing Number In Array (Two Arrays)
Airbnb InterviewPython Interview
Advance this person to the next round?
Technical Skills:
4/4
Problem Solving Ability:
4/4
Communication Ability:
4/4
Show Transcript

Let’s start by reframing this problem as a real world problem. Imagine you are a warehouse supervisor and you have to conduct a roll call every morning.

To keep the problem statement analogous to our real-world example, let’s refer to the longer list as `the register`, which enumerates all the workers who are supposed to show up to the warehouse on a given day. Let’s refer to the smaller list as the actual `attendance`, denoting who actually showed up.

#### Approach 1: Brute Force

In a warehouse the supervisor typically has a list of all the workers for a given shift. As the workers arrive they cross them out on the list, leaving them with a list of absentees. To conduct our roll call, we run through `the register` (or longer list), verifying that we have the integers in `the attendance` list (the smaller list). If any of the workers are missing, we would have found our absentee, so we return it or break out of the loop and return the integer at which we stopped iterating. The code for this in python would be as follows:

``````list1 = [1,2,3,4]
list2 = [1,2,3,4,5]

def find_missing_number(list1, list2):
for integer in list2: # Conduct our roll call against the list2
if integer not in list1:
return integer # Break when an absentee is found

find_missing_number(list1, list2) # returns 5``````

This approach works but is not very efficient, as we have to scan the entire `attendance` list every time we look up a number. Scanning is an `O(n)` (linear) operation; thus, we have an `O(n²)` solution (Our example models the worst case as we have to scan both lists exhaustively). The space complexity is `O(1)` as we only need to keep tracking one integer at a time, the index.

#### Time/Space Complexity

Time Complexity: `O(n²)` Space Complexity: `O(1)`

Can we do better? Of course! Let's think a bit more. What if we did not have to scan the register each time we wanted to look up a worker? In come hash tables / dictionaries and sets to the rescue!

#### Approach 2: Leveraging `O(1)` lookups

The first approach works, but scanning the attendance list every time is not optimal. Can we leverage any data structures that optimize for lookups? Both hash maps and sets allow for `O(1)` lookups. We can optimize for time by converting the smaller list into a set, then searching for every item in the longer list within the set. The tradeoff with this approach is increased efficiency from a time-complexity perspective at the expense of additional memory / space complexity, as we need an auxiliary hash table or set to do the `O(1)` lookups.

``````def find_missing_number_sets(list1, list2):
list1 = set(list1) # ’Setify’ the list
for integer in list2: # Conduct our roll call against the list2
if integer not in list1:
return integer # Break when an absentee is found

print(find_missing_number_sets(list1, list2)) # returns 5

def find_missing_number_dict(list1, list2):
list1 = {elem:True for elem in list1} #Turn into a dict
for integer in list2: # Conduct our roll call against the list2
if integer not in list1:
return integer # Break when an absentee is found

print(find_missing_number_dict(list1, list2)) # returns 5
``````

#### Time/Space Complexity

• Time Complexity: `O(n)`
• Space Complexity: `O(n)`

Due to the constant time lookup, we have now achieved `O(n)` lookup time but now use `O(n)` space. It seems like the classic quip, ‘throw a hash-map at it’, actually holds its weight.

But is this the best we can do? For time complexity, sadly, yes. But can we optimize for space?

#### Approach 3: Supervisor keeping track of orders filled

Consider the list `[1, 2, 3, 4, 5]`, which sums to 15. Now consider another list `[1, 2, 3, 5]`, which sums to 11. Notice anything about these two numbers? The difference between 11 and 15 is 4, which is our answer.

``````def find_missing_number_simple_sum(list1, list2):
expected_sum = 0

for integer in list2: expected_sum += integer # Calculate expected total
for integer in list1: expected_sum -= integer # Subtract filled orders

return expected_sum # Return pending orders

print(find_missing_number_simple_sum(list1, list2))

def find_missing_number_sum_simultaneous(list1, list2):
expected_sum = 0

for idx in range(len(list1)):
expected_sum += list1[idx]
expected_sum -= list2[idx]

# Include last entry of the list2
return expected_sum + list2[-1] # Return pending orders

print(find_missing_number_sum_simultaneous(list1, list2))
``````

#### Time/Space Complexity

• Time Complexity: `O(n)`
• Space Complexity: `O(1)` for most cases, but technically `O(n)`. On the surface this seems like an `O(1)` space complexity solution, however if the sum of the numbers exceeds the 32-bit or 64-bit integer memory limit the space complexity will actually be `O(n)`. This is why we see the Big Int / Big Integer construct in certain languages.

#### Approach 3: Bit Manipulation

Please note that this is an advanced technique that is not expected to be known for most programming interviews, so if you are studying for an entry-level, mid-level or Senior position this content is likely overkill relative to what you need to know.

All integers can be represented as 0s and 1s. With just 2 bits, 0 is equivalent to 00, 1 is equal to 1, and 3 is 11. There are only 10 types of people in the world, those who understand binary and those who do not. Ensure you do before proceeding to the next section :)

XOR

XOR stands for 'exclusive OR'. Simply put, it is an OR operation without the possibility of equality. True OR True would be True, but True XOR True would be False. How does this help us?

Integers typically are represented using 32 bits, at least in Python.

The number 1 as base 2 is:

"00000000000000000000000000000001"

The number 10 would be equivalent to the following:

"00000000000000000000000000001010"

So long as an entry is an integer, we only need 32 bits to keep track of it.

If we start with a clean slate, i.e., the value 0, we would have the following:

"00000000000000000000000000000000"

Now, what if we did a bitwise XOR between 0 and 1?

“00000000000000000000000000000000” XOR “00000000000000000000000000000001”

Each positionally equivalent bit will be compared, and we will end up with a 1:

“00000000000000000000000000000001”

Converting this back to base 10 returns 1.

Now let us ask ourselves, what if we did an XOR of 1 and 1?

We would return 0 for that first bit, thus reverting to a clean slate again.

“00000000000000000000000000000001” XOR “00000000000000000000000000000001”

= “00000000000000000000000000000000”

You can play around with bitwise operations here: https://unsuitable001.github.io/BitViz/

Now how is this useful for our problem?

For each integer both in the attendance list and the registry, so long as an XOR operation is performed, equal integers will cancel out their corresponding bit flips and restore the clean slate of the result, all but 1. The absentee integer in the registry will perform bit flips that will not be canceled out. As a result, once we are done with the XOR operations over all entries, the only 1 bits standing will represent the absent number once we are done.

In python and Javascript, ^ is used to perform the bitwise XOR operation. As such, you can do 1 ^ 1, which should return 0.

``print(1 ^ 1) # 0``

Building on this logic, we can start with a 0 as our slate and XOR all integers in both lists against the slate. Upon iterating through all elements, we will be left with our absent integer:

``````def find_missing_number_simple_bitwise(list1, list2):
slate = 0

for integer in list2: slate ^= integer # XOR integers against slate
for integer in list1: slate ^= integer

return slate # Whatever we are left with is the missing int

print(find_missing_number_simple_bitwise(list1, list2))

def find_missing_number_bitwise_simultaneous(list1, list2):
slate = 0

for idx in range(len(list1)):
slate ^= list1[idx] # XOR against slate simultaneously
slate ^= list2[idx]

# Include last entry of list2
return slate ^ list2[-1] # Return pending integers
print(find_missing_number_bitwise_simultaneous(list1, list2))
``````

And there we have it, a clean approach that uses a constant number of bits and thus is precisely constant space.

#### Time/Space Complexity

• Time Complexity: `O(1)`
• Space Complexity: `O(1)`, as we will only ever need 32 bits for so long as all our entries are integers.

## Practice Questions Similar to Find Missing Number In Array (Two Arrays)

MEDIUM
Data Structures and Algorithms
Build a Max Heap

Given an array of integers, transform the array in-place to a max heap.

Watch 1 interview
MEDIUM
Data Structures and Algorithms
Three Sum

Given an array of integers, return an array of triplets such that i != j != k and nums[i] + nums[j] + nums[k] = 0.

Watch 1 interview
EASY
Data Structures and Algorithms