3–Longest Substring Without Repeating Characters
Restating Info We’re Already Given:
- Here, we’re given a string s and have to find the longest substring that doesn’t have any repeating characters. Then, we return the length of that substring.
Questions/Clarifications to Possibly Ask Interviewer (If starting information we’re given isn’t sufficient to begin):
- I see all the test cases only contain letters, should this be able to handle any character (i.e. unicode) or just letters?
Discuss Naive Solution:
Naively, we should be able to check every substring for repeating characters. To do so, we would use a double for loop for the start and ending indices and check that substring for repeating characters, storing the max length as a variable outside the loops updated whenever we find a longer length substring with unique characters.
Discuss Optimal Solution:
Instead, we can solve this in linear time. To do so, we need a dictionary to store the most recent index for each character, lastOccurenceOfChar, and integer variables to represent the longest substring length, longestSubstringLength, and the first index for the current substring, currSubstringFirstIndex. The dictionary will indicate to us (in constant time) two things:
- Whether that character has been seen so far in string s
- If it has, then at what index it was seen
With this in mind, we traverse the string s using built-in enumerate() to get the index and the character at the index. For every character, 2 things are possible:
- We haven’t seen the character
- We have seen the character
If we’ve seen the character, we update its entry in the dictionary(updating the value) to the max of two possible values:
- currSubstringFirstIndex: if this is the greater of the two, the last time we saw this character was before we started counting for the current substring.
- lastOccurenceOfChar[char] + 1: if this is the greater of the two, then we are in a scenario where the character we are currently on exists in the current substring and we have to update the currSubstringFirstIndex to 1 after the first time we saw it.
Otherwise, we add/update the entry for lastOccurrenceOfChar and update longestSubstringLength. At the end, we return the longestSubstringLength.
Code Optimal Solution:
This algorithm runs in linear time (one for loop) and has worst case linear space complexity (one slot for each character).
- Empty string
- String with only unique characters
- String with repeating characters where currSubstringFirstIndex should be updated
- String with repeating characters where currSubstringFirstIndex should not be updated