The Subarray Sum Equals K problem involves finding the number of subarrays in a given array whose elements add up to a number K. A subarray is a contiguous collection of elements in an array, making this problem a great application of iterative search techniques as well as more complex prefix sum algorithms.
Given an unsorted array of integers nums
and an integer k
, find the number of subarrays in nums
whose sum equals to k
.
Input: nums = [2, 2, 2], k = 4 Output: 2
Input: nums = [3, 2, 1], k = 3 Output: 2
Input: nums = [2, 2, -4, 1, 1, 2], k = -3 Output: 1
Constraints
Arrays are common in interviews - it's a fundamental data structure that doesn't rely on specialized experience, so it's a great venue to test problem solving skills.
Plus, when these problems involve some kind of search component, they create a great opportunity to showcase an engineer's ability to consider tradeoffs and optimizations. That's because naively searching through an array is typically programmer's low-hanging fruit: you can almost always make loops upon loops to traverse the search space and perform conditional checks at each iteration. Optimizing from the brute force will often involve additional data structures (incurring space cost) and demand a more careful understanding of the problem.
As usual, let's start with definitions from the problem description. We are asked to search for subarrays - unlike a subsequence or a subset, this is a contiguous collection of elements within the array. Also notice that our expected output is the number of subarrays that meet some criteria, not the actual subarrays themselves - this will be a hint for further optimizations later on.
Since this problem asks us to consider each subarray from the input, the brute force approach involves merely generating all possible subarrays (then performing the check for our criteria).
To do this, we perform two nested for-loops and capture the elements between the two pointers at each iteration. For each subarray we calculate the sum with a linear operation, which in total accounts for a time complexity of O(n³)
, using a global count to track how many sums equal to k.
class Solution:
def subarraySum(self, nums: List[int], k: int) -> int:
count = 0
for i in range(0, len(nums)):
for j in range(i + 1, len(nums) + 1):
if sum(nums[i:j]) == k:
count += 1
return count
1class Solution:
2 def subarraySum(self, nums: List[int], k: int) -> int:
3 count = 0
4
5 for i in range(0, len(nums)):
6 for j in range(i + 1, len(nums) + 1):
7 if sum(nums[i:j]) == k:
8 count += 1
9
10 return count
O(n³)
, where n
is the number of elements in nums. We use O(n²)
time for the nested loops, and an additionally nested O(n)
for calculating the sum of each subarray.O(1)
.As a quick follow-up to the brute force solution, we can see that the way we calculate subarray sum is inefficient. Why re-trace our steps with an additional linear scan, when we can simply track the running sum for a subarray as j
iterates, and resetting the sum when i
iterates? Avoiding the nested linear operation improves our time complexity to quadratic - O(n²)
!
class Solution:
def subarraySum(self, nums: List[int], k: int) -> int:
count = 0
for i in range(0, len(nums)):
current_sum = 0
for j in range(i, len(nums)):
current_sum += nums[j]
if current_sum == k:
count += 1
return count
1class Solution:
2 def subarraySum(self, nums: List[int], k: int) -> int:
3 count = 0
4
5 for i in range(0, len(nums)):
6 current_sum = 0
7
8 for j in range(i, len(nums)):
9 current_sum += nums[j]
10
11 if current_sum == k:
12 count += 1
13
14 return count
O(n²)
.O(1)
for the counts hash map.At this point, we have a decent quadratic solution with constant space - but, one common way to continue optimizing runtime is through the introduction of additional data structures and pre-processing, at the expense of additional space complexity. Be sure to mention these tradeoffs with your interviewer!
Recall that we are asked to return the number of subarrays whose sum equals k, not the subarrays themselves. So, do we really need to generate all the subarrays? What we only need to know is how many times the sum k
can be created with any possible subarrays - and this can be found with some deductive reasoning and a clever concept called prefix sums.
Firstly, let's generate our subarray sums with the aid of some pre-processing. Create a new array and populate it with the cumulative sums of all the prefixes in nums, where prefixSum[1] == nums[0] + nums[1]
and prefixSum[2] == nums[0] + nums[1] + nums[2]
. This alone doesn't capture all the subarrays, only those originating with index 0, but we can apply a clever formula to capture the rest.
If we know the sums of two prefixes, the sum of the subarray between the two prefixes is the difference between the prefix sums. Let's use this notation to indicate the sum of a subarray from index i
to j
- SubSum[i, j]
- and look at an example:
To find the sum of the subarray from index 2 to 3, we can subtract prefixSum[1] from prefixSum[3].
With this prefix array, we can use our same nested for-loop to consider each subarray and get it's sum in constant time; but, this still keeps us to a quadratic runtime.
How can we optimize even further? We would have to limit the algorithm to a single nested loop. We know we can generate the prefix sums with a linear scan, as well as use the formula to find certain subarray sums. Let's apply another clever strategy: if SubSum[0, 3] = 1
and k = -3
, and if we have seen a subarray with the sum 1 - k
, then we know that SubSum[0, 3]
is part of a subarray whose sum is k
. Which is the only thing we're actually looking to know!
Instead of generating the entire prefix array, we iterate through nums while keeping a running sum and track how many times we've seen each prefix sum with a frequency hash. At each iteration, deduce if our target sum (runningSum - k
) has been seen before - if so, increment a global count - and then update the frequency hash. Finally, return the count.
Why do we initialize the hash with {0:1}
? Well, this handles the edge case when targetSum is 0, since this sum does in a sense exist - following the above logic, the number of subarrays with sum 0 is 1, as this is the empty subarray, no elements.
class Solution:
def subarraySum(self, nums: List[int], k: int) -> int:
running = 0
total = 0
sum_frequency = {0: 1}
for i in range(0, len(nums)):
running += nums[i]
target_sum = running - k
if target_sum in sum_frequency:
total += sum_frequency[target_sum]
sum_frequency[running] = sum_frequency.get(running, 0) + 1
return total
1class Solution:
2 def subarraySum(self, nums: List[int], k: int) -> int:
3 running = 0
4 total = 0
5
6 sum_frequency = {0: 1}
7
8 for i in range(0, len(nums)):
9 running += nums[i]
10 target_sum = running - k
11
12 if target_sum in sum_frequency:
13 total += sum_frequency[target_sum]
14
15 sum_frequency[running] = sum_frequency.get(running, 0) + 1
16
17 return total
O(n)
.O(n)
for the counts hash map.interviewing.io is a mock interview practice platform. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.
Interview prep and job hunting are chaos and pain. We can help. Really.