DATA STRUCTURES AND ALGORITHMS>SEARCH

Longest Substring Without Repeating Characters

MEDIUM
Written By
tom-wagner.png
Tom Wagner
kenny-polyack.png
Kenny Polyack

ProblemLongest Substring Without Repeating Characters

Given a string `s`, find the length of the longest substring without repeating characters.

Example Inputs and Outputs

Example 1

Input: "abcabcbb"
Output: "3" The longest substrings without repeating characters are `abc`, `bca`, and `cab`, all with length 3.

Example 2

Input: "bbbbb"
Output: "1" The longest substring without repeating characters is `b`, with length 1.

Example 3

Input: "kvvfev"
Output: "3" The longest substring without repeating characters is `vfe`, with length 3.
Constraints
  • 0 <= s.length <= 1000000
  • The character set for `s` is ASCII.

Solution Longest Substring Without Repeating Characters

Longest Substring Without Repeating Characters Approaches

When asked to explore a "subsection" of an array or string, always confirm your understanding of the search space. Remember that a substring is a contiguous slice of characters taken from a string, as opposed to a subsequence which can skip characters, or a permutation which can be out of order. With this knowledge, we can rule out any need for backtracking, and instead start to think more greedy.

Another important thing to confirm with your interviewer when handling strings is the character set, as this will help us define the space complexity of the problem. A safe assumption is that our solution will implement ASCII (128 characters) or Extended ASCII (256 characters)

1. Brute Force

With the search space defined, a naive approach becomes fairly straightforward: generate each substring from the input string, check if there are any repeating characters for each, and record the max length found for valid substrings.

We can separate this work into two subproblems. First, we need a helper function to determine if a given string has any repeating characters - this can be done by scanning a given range and counting character occurrences with a hash map.

Next, to generate each substring from the input string s, we can enumerate all substrings with two nested loops. Given i and j as all possible start and end indices for a substring of s, we have 0 <= i <= j <= n, where n is the length of the input string.

Longest Substring Without Repeating Characters Python and JavaScript Solutions - Brute Force

class Solution:
    def is_unique_within_range(start: int, end: int) -> bool:
        chars = set()
        for i in range(start, end + 1):
            char = s[i]
            if char in chars:
                return False
            chars.add(char)
        return True
    def length_of_longest_substring(s: str) -> int:
        result = 0
        for i in range(len(s)):
            for j in range(i, len(s)):
                if (self.is_unique_within_range(i, j)):
                    result = max(result, j - i + 1)
        
        return result
TIME / SPACE COMPLEXITY ANALYSIS
  • Time Complexity: O(n³). Two nested loops is O(n²), and the helper function makes an additional nested loop.
  • Space Complexity: O(min(n, m)), where `n` is the length of the string `s` and `m` is the size of the character set.
2. Sliding Window

To optimize the naive approach, let's see where duplicate work is being performed.

Consider this example: if the string "abc" has no repeating characters, and we append a new character "d" at the end to make the string "abcd", do we need to re-check if "abc" has repeating characters? Our two nested loops are doing just that - repeatedly checking sections of the string that have already been evaluated - and our helper function is re-checking for repeat characters without using information from overlapping checks.

Instead of naively enumerating each substring, how can we intelligently skip substrings that we already know to be invalid? Let's look at an example:

longest_substring_wo_repeating_char_1-1.png

Let's iterate i until the substring in the range i to j is valid again. Here, once i is at index 2, there is only one a character in the substring, so we can again try to increase the length by iterating j anew.

This technique is known as a sliding window. The window represents a candidate substring bounded by a start and an end pointer, which expands and contracts based on what we're looking for. Sliding windows are commonly used when searching for an optimal range in an iterable data structure like a string or an array. This is because the algorithm will first search for a possible answer before then expanding (or contracting, whichever is the priority) to try to optimize. For example, if we were searching for the shortest substring that contains the letters a and b, we would only contract our window if those letters existed in the current substring (valid condition), otherwise we're adding characters until the condition is met.

Conversely, when looking for the maximum possible length, as in the problem at hand, we can only expand the window when the current substring is valid (otherwise, we must first arrive at a valid state by removing characters from the left). Looking back at the example above, j only iterates forward while the condition is satisfied - in this case, all characters between i and j are unique - and i only iterates forward while the condition is not satisfied.

Having made this insight on our substring enumeration, we can further improve the way we check for substrings with unique characters. Remember that a sliding window allows us to take advantage of overlapping subproblems - so, instead of re-checking each window for unique characters, let's track the current character count of the window and evaluate the new state when a single character is added or removed.

More formally, if we know that a substring in the range i to j has no repeating characters, then when adding the next character at j+1, we simply need to check if that character already exists in the previous range. And because we established the character set at the beginning of our solution, this can be determined in constant time using an array of length 128 to represent character occurrence in a substring.

Note that, although there are still two nested loops in our code, the time complexity of the iteration is now linear, as i and j will only iterate over each character once. Unlike our naive implementation, the two pointers are not dependent on one another, and instead iterate based on the state of the range between them, which can be determined in constant time and space.

Longest Substring Without Repeating Characters Python and JavaScript Solutions - Sliding Window

So, let's iterate i and begin considering new candidate substrings, since we've already seen the longest possible substring with i at index 0. But adba is also invalid, so any iteration on j will remain invalid.

longest_substring_wo_repeating_char_1-4.png

kadb is a valid candidate for longest substring since it has no repeating characters.

longest_substring_wo_repeating_char_1-2.png

If j moves to index 4 though, the new substring kadba would be invalid. So, we can deduce that with i at position 0, all subsequent iterations of j will be invalid, since they will contain this invalid substring. We can confidently stop iterating j at this point.

longest_substring_wo_repeating_char_1-3.png

class Solution:
    def length_of_longest_substr(s: str) -> int:
        # List representing ASCII characters to track occurences
        chars = [0] * 128
        result = 0
        i = 0
        j = 0
        while j < len(s):
            right_char = s[j]
            chars[right_char] += 1
            
            while chars[right_char] > 1:
                left_char = s[i]
                chars[left_char] -= 1
                i += 1
            
            result = max(result, i - j + 1)
        
        return result
TIME / SPACE COMPLEXITY ANALYSIS
  • Time Complexity: O(n). At worst, the two pointers both perform linear scans of the string, which would be O(2n). This can be interpreted as O(n) in Big-O notation, since constants are ignored when determining order of magnitude.
  • Space Complexity: O(1). Limited to 128 ASCII characters, space is constant.
3. Sliding Window Optimized

As a marginal optimization, we can avoid two passes with our pointers by improving how we cache character occurrences and update our pointers.

Remember what we determined in the previous approach: once a duplicate character is found in a given window, this candidate substring will remain invalid until one of the dupes is removed. Seeing that we need to arrive at a valid state before we can continue iterating our right pointer, can we improve how we iterate the left pointer? Indeed! Instead of iteratively searching for the next valid position, we can deduce where it would necessarily be: the index directly after the first occurring duplicate's index in the substring. Moving i to this position effectively removes the duplicate from the substring.

Tracking character counts in our hash map won't serve us anymore - instead, let's record the last index at which each character was encountered, so the left pointer can be updated in constant time. Thus when the right pointer finds a repeat, we simply update the left pointer to the position of the last occurring repeat (plus 1).

Longest Substring Without Repeating Characters Python and JavaScript Solutions - Sliding Window Optimized

class Solution:
    def length_of_longest_substr(s: str) -> int:
        result = 0
        hash_map = {}
        i = 0
        j = 0
        while j < len(s):
            char = s[j]
            # If a duplicate is found, update i to our stored next valid position
            if char in hash_map:
                i = max(hash_map[char], i)
            
            result = max(result, j - i + 1)
            # Store the next index for this character, as this will be the next valid position to de-duplicate
            hash_map[char] = j + 1
        
        return result
TIME / SPACE COMPLEXITY ANALYSIS
  • Time Complexity: O(n)
  • Space Complexity: O(1)

Practice Questions Similar to Longest Substring Without Repeating Characters

MEDIUM
Data Structures and Algorithms
K Closest Points

Given a list of tuples that represent (X, Y) coordinates on an XY plane and an integer K, return a list of the K-closest points to the origin (0, 0).

EASY
Data Structures and Algorithms
Reverse Linked List

Given the head of a linked list, reverse the list and return the new head.

HARD
Data Structures and Algorithms
Alien Dictionary

Given a lexicographically sorted list of words, return the order of the alphabet. Also seen as: Given a list of lists in a lexigraphic ordering, return the ordering of the characters in the language.

About interviewing.io

interviewing.io is a mock interview practice platform. We've hosted over 100K mock interviews, conducted by senior engineers from FAANG & other top companies. We've drawn on data from these interviews to bring you the best interview prep resource on the web.

Ready to practice with a mock interview?

Book mock interviews with engineers from Google, Facebook, Amazon, or other top companies. Get detailed feedback on exactly what you need to work on. Everything is fully anonymous.

Computer Dude