Longest Substring Without Repeating Characters

Longest Substring Without Repeating Characters

Photo by Jack Anstey on Unsplash

The problem of finding the length of the longest substring without repeating characters is a classic algorithmic problem.

The problem is to find the length of the longest substring in a given string s without any repeating characters. In other words, we need to find the maximum length of a substring in s that does not contain any duplicate characters.

For example, given the input string s = "abcabcbb", the longest substring without repeating characters is "abc" with a length of 3. Another possible solution is "bca" with the same length.

Example 1: Input: s = "abcabcbb" Output: 3

Explanation: The longest substring without repeating characters is "abc" with a length of 3.

Example 2: Input: s = "bbbbb" Output: 1

Explanation: Each character in the string is repeated, so the longest substring without repeating characters is any individual character, such as "b" with a length of 1.

Example 3: Input: s = "pwwkew" Output: 3

Explanation: The longest substring without repeating characters is "wke" with a length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Example 4: Input: s = "" Output: 0

Explanation: An empty string has no repeating characters, so the longest substring without repeating characters has a length of 0.

To solve this problem, we can use a sliding window approach. We maintain a window that expands from the left side while ensuring that there are no duplicate characters within the window. We also keep track of the maximum window size encountered so far, which represents the length of the longest substring without repeating characters.

In Python

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) < 2:
            return len(s)
        window = []
        windowSize = 0
        for ele in s:
            while ele in window:
                window.pop(0)
            window.append(ele)
            windowSize = max(windowSize, len(window))
        return windowSize

In Go

func lengthOfLongestSubstring(s string) int {
    if len(s) < 2 {
        return len(s)
    }
    window := ""
    windowSize := 0
    for _, ele := range s {
        for i, element := range window {
            if element == ele {
                window = window[i+1:]
                break
            }
        }
        window += string(ele)
        if len(window) > windowSize {
      windowSize = len(window)
     }
    }
    return windowSize
}

The time complexity of the given code is O(n²), where n is the length of the input string s. This is because the inner while loop iterates over the window list and potentially removes elements from the front of the list. In the worst case, when all characters in s are unique, the while loop may need to iterate up to n times for each character in s. Therefore, the overall time complexity is quadratic.

The space complexity of the code is O(n), where n is the length of the input string s. This is because the window list is used to store unique characters temporarily. The size of the window list can grow up to the length of s in the worst-case scenario where all characters are unique. Therefore, the space complexity is linear concerning the input size.

Is there a way to improve the time complexity further?

Yes,

  1. Initialize a dictionary/map, last_seen, to track the last-seen index of each character in the string s.

  2. Set a start pointer to mark the start index of the current substring and a max_len variable to store the maximum length encountered.

  3. Iterate over each character in s.

  4. If the current character is already present within the current window (i.e., its index is greater than or equal to the start), update the start pointer to the index right after the previous occurrence of the character. This slides the window to the right, excluding the repeating character and any characters before it.

  5. Update the last_seen dictionary with the current index of the character, marking its most recent occurrence.

  6. Calculate the length of the current substring by subtracting start from the current index and add 1.

  7. Update the max_len variable if the current substring length is greater than the previous maximum length encountered.

  8. After the iteration, return the value of max_len, which represents the length of the longest substring without repeating characters in the given string s.

In Python

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        if len(s) < 2:
            return len(s)
        last_seen = dict()
        start, max_len = 0, 0

        for i, char in enumerate(s):
            if char in last_seen and last_seen[char] >= start:
                start = last_seen[char] + 1
            last_seen[char] = i
            max_len = max(max_len, i-start+1)
        return max_len

In Go

func lengthOfLongestSubstring(s string) int {
    lastSeen := make(map[byte]int)
    start, maxLen := 0, 0

    for i, char := range s {
        if lastIdx, ok := lastSeen[byte(char)]; ok && lastIdx >= start {
            start = lastIdx + 1
        }
        lastSeen[byte(char)] = i
        currLen := i - start + 1
        if currLen > maxLen {
            maxLen = currLen 
        }
    }
    return maxLen
}

Note: convert the char to byte(char) to match the map's key type, or the compiler will complain.

This optimized solution achieves a linear time complexity of O(n), where n is the length of the input string s, as we iterate over the string only once. The space complexity is O(m), where m is the size of the character set (number of unique characters), due to the usage of the last_seen dictionary.