Longest common subsequence of two strings
Given two strings s1 and s2, return the longest common subsequence of s1 and s2 (with longest common subsequence defined as the longest sequence of characters such that all of them appear in both of the strings, possibly with other characters in between)
Feedback about Stealthy Dictaphone (the interviewee)
Advance this person to the next round?
How were their technical skills?
How was their problem solving ability?
What about their communication ability?
Good job on the whole. You wrote the simple recursive solution quickly (and concisely), and did a good job of explaining your approach and walking through it on an example. When optimizing the code, you used memoization rather than dynamic programming (which is the classic approach for this problem), but it still worked. A few minor points: rather than passing around indices, you could pass substrings instead (though you may have done this for performance reasons.) Also, rather than returning a list of subsequences at each step, you can just return the longest one. Also, you did a good job talking through your thoughts at the beginning and end of the interview, but there was a portion in the middle (perhaps 20 minutes in to 40 minutes) where you weren't saying as much, so make sure you don't stop narrating what you're doing as you get more engrossed in the problem. You may find it interesting to compare your solution to the traditional one here: https://www.ics.uci.edu/~eppstein/161/960229.html
Feedback about Paisley Wallaby (the interviewer)
Would you want to work with this person?
How excited would you be to work with them?
How good were the questions?
How helpful was your interviewer in guiding you to the solution(s)?
Interviewer asked if I had any questions - I said no but wish I would have asked if I got the optimal solution and answered his questions about time complexity correctly. He's likely to provide that in his feedback, but if I did something incorrectly and don't understand why, it would have been helpful to be able to ask a clarifying follow-up question to the interviewer. For this reason, interviewer could consider letting interviewee know at the end if they had correct/optimal answers/solutions and invite questions about that.
s2and returns the longest common subsequence of
s2. And what I mean by that is best explained through a few examples. In the first case if you're given these two strings you would return
“ABAD”so the longest common subsequence is defined as the longest sequence of characters such that all of them appear in both of the strings, possibly with other characters in between, so you're allowed to delete characters to get the longest common subsequence but the order has to be the same. So likewise in this second example you return
“GTAB”which appears contiguously in the first string but there are characters interspersed in the second string and the last example is sort of a simple case where all the characters are the same so you just return the shorter string. Couple of suggestions: don't worry about performance at first, just come up with the simplest most-naive solution you can and if we have time we'll think about optimizing it. Also come up with more test cases if you like and it's helpful to write out your algorithm either as pseudocode or in English in a comment before you start coding, people who do that usually do better and lastly as much as possible say what you're thinking while you're working on the problem because it's helpful for me to know how you're approaching the problem and not just seeing the code. Stealthy Dictaphone: Got it, thank you very much. So I want to think about edge cases first. So I assume this is case sensitive? Paisley Wallaby: Yeah, it's case sensitive. I think that shouldn't make it any more complicated. Stealthy Dictaphone: Right, so we're just simply comparing characters to characters and then an empty string compared to anything is always going to be empty string. One possible edge case I'm thinking of is something like:
“ABBA”, “ABCABA” => “ABBA”. I should be able to use both
“B”s there. Okay so, as far as a solution to this, if I just try to think of a brute force to this I can look at one character in the first string find it in the second string and then just look for every other character that comes after this first character in the second string, or comes after the last character I found in the second string add that to my result and get that same result from each of the characters in the first substring. So I'm greedily looking for the first character or the first occurance of each character in the second substring and getting my result if I were to start with that and I'm greedily doing it for each of the characters. And then I think if I just take the longest of those I think that will be my answer. Paisley Wallaby: So when you say the longest of those you mean. Sorry I missed if you said...so what I got is you said you were going to look at the first character in s1, search for that in
s2and sort of proceed starting from there but is there a part that you're repeating? Stealthy Dictaphone: Yeah so if I were to look for the first character here I would find
“A”and then I would look for
“B”after this, but that's like a greedy approach and so I also need to get the result if I were to start with
“A”but then I need to get the result if I were to start with
“B”because then I would find this and that might even give me a longer result so I need to do that for each of the characters in
s1and then take the longest. So I think that is a working algorithm; I'm sure I could be more efficient than that so I don't know if you want me to try and optimize that more. Paisley Wallaby: No, this is fine to start with. Stealthy Dictaphone: Okay so, you gave me good advice of trying to write some steps here: for each character in
s1finding first occurance in
s2and getting the result. Try this for each character in
s1, take the longest result. So we're going to call this
longestSubseq(s1, s2)and just to verify the signature here I'm going to be returning an actual string. Paisley Wallaby: Yeah, returning a string. Stealthy Dictaphone: Okay, so. So I'm going to get all my results here, so this is all the results for each of the possible characters in
s1. I want to...basically if the character I'm looking at I can't find the second string I don't really have to worry about the result I get out of this particular character because the result I would want is just looking at the next characters and I'm just getting another character. I'm going to find the index of this character.
if(s2Indx === -1)my result is just going to be an empty string, and I'm just going to build these strings with arrays for time complexity and I'll join them later. I want to get kind of recursively, I'm thinking I can say I want the results of this string starting at this index and this string starting at this index. If I'm doing this recursively I need to take these in here, the default values. I need to sort of give my recursive call the string I'm already using. Paisley Wallaby: Say that again? For which argument are you talking about? Stealthy Dictaphone: So if I call this recursively, say I want to find the result...so I'm trying this first
“A”and I find it here now I want to find the longest common subsequence of these last five characters with these last four characters I need to be passing into this recursive call I think.
if (s2)...what do I call this. I want to set the limit here...Okay so,
let result = if that's
-1then we'll just go ahead and return the result otherwise I want to push in the character itself because that's what we're starting with and then I want to push in the characters we find from the
longestSubseq(s1, s2, s1Idx + 1, s2Idx + 1)If I remember the API for
findIndex()correctly the second argument is going to be the from index so it starts searching from that. Paisley Wallaby: And you're welcome to look that up online when it comes to using library functions. Stealthy Dictaphone: Oh, I'm thinking of the array
indexOf(). And then return result, if it's not equal to….Alright but I'm doing the full
s1here and I really don't want to map the whole thing because I want to start at the
s1start index via the for loop instead.
for(let i = s1StartIdx; i < s1.length; i++)....and them I'm going to
longest = findLongest(results)then I'm going to return my longest. I want to walk through that with an example should I? I haven't written this, should I write it first? Paisley Wallaby: Yeah why don't you go ahead and write it first? Stealthy Dictaphone: Okay,
let max = ;And them I'm going to return the
for (let candidate of arr)and
if candidate.length > longest.lengththen
longestgets reassigned to
candidate.So I'd like to try an example now. Let's start with this first example. Paisley Wallaby: Sure Stealthy Dictaphone: Okay, I want to keep track of what my
resultslook like and then I can keep track of what my…..So we start out with
s1Idx = 0and
s1Char is “A”and the result is an empty array then we look for….how did I lose my
s2Idxwhere did that go?
const s2Idx = s2.indexOf(s1char, s2StartIdx). So this is going to be...we're going to find that at 1 and then we assume it's not
-1and we push
“A”and we get the longest subsequence of
s2and all this starting at a later index now. So then we...
s2Idxwe look for that:
indexOf(B, 3), yeah and the results: we are pushing in the
“B”because we found it. No this result right now is just the
“B”and we look for the remainder with the
“A”I need to get to one of these that we're not going to find and see how that works. So this next recursive call the
s2IdxI'm just going to assume it works because it's similar to the previous case becomes
4. And now the next one I want to see how it works:
3and this is where we're looking at the
“Z”and this is where
-1, so now we're pushing in an empty array and we never make the recursive call which is what we want. So then we're looking at….so then
s1Idxgets incremented because we start this for loop back over then we look at the
“D”and that is going to find...we're still looking at the right substring of
s2here every time which is what we want and we're looking for that
“D”and we're eventually going to find that
“D”so results eventually here, results is going to include this and
“D”is going to return, and longest is going the return the
“D”. Yeah, so yeah that's going to return up to this here, so the
“D”is going to come into here and this is going to be one of the results and I think that's going to come into here as one of the results and that's going to come into here. I don't think I can afford the time to go into this in more detail but I assume this is working the way I expect. Should I run it? Paisley Wallaby: Yeah, let's try running it. Stealthy Dictaphone: So, Okay.
s1charis not defined…. Paisley Wallaby: I think you have a lowercase where you meant uppercase. Stealthy Dictaphone: Okay,
findLongest(), I misspelled that. Okay that gave use the result and let's try this one. Okay alright, seems to be working correctly.Okay so I'm happy with this as far as being a working solution… Paisley Wallaby: Do you want to try the second test case just to be sure the one on line 16? Stealthy Dictaphone: Sure. Paisley Wallaby: Okay cool, so as it is what's the running time of your code? Stealthy Dictaphone: Okay, not great. Let's see so we got...so we're branching out for every call...we make potentially one recursive call for every character in
s1so that's going to be
O(n)but then for each of those recursive calls. So if I think of this like a tree we start at our initial call and that branches out to
ndifferent branches and each of those
nbranches is going to also branch out almost
nbranches and so the depth then of that tree is going to be also
nbecause we're doing this for each...doing that until we get to the end of
s1so I'm thinking that's
O(n^n). Paisley Wallaby: Yeah, or exponential. Right so we have maybe 20 minutes left. So can you think of a way to optimize this code? Stealthy Dictaphone: Yeah I'm thinking...I'm wondering if I can memoize the results that we're getting. So there's probably going to be in this big tree that we have with
nbranches per node at
nlevels, I think we're going to be making the same calls a couple of times. There are going to be multiple times where we just got...we're starting with the
“D”here and we're going to be looking at this particular substring in
s2. If I just build a matrix of these results, I think I have two dimensions here: where I'm starting at
s1and where I'm starting at
s2, so I think I can build a matrix of the results I'm getting and memoize them and then check that matrix first before I do the calculations. Paisley Wallaby: Sure yeah that sounds like that'll work. Stealthy Dictaphone: Okay so I will create a
memo. Since it's a multi-dimensional array I'm just going to initialize it in here. So
if (memo == null)then I initialize it:
memo = new Array(). Okay the rows: I'll make my rows possible
s1indexes and I'll make the columns my possible
s2indexes. Okay else
if(memo[s1StartIdx][s2StartIdx])if I find something there then I should return it. Then down here I should set the
memo: memoize result and I should pass in my
memoin the recursive call. I think that does it. Oh I put this in the wrong spot I'll just do that and we return
longest. I could probably make some more optimizations in terms of you know I'm returning a string here and then I'm converting it back into an array here to push the characters in I was trying to do that to avoid concatenating string characters to earlier characters and I'm just creating new strings overr and over again which is bad for time complexity but I still end up creating a new array out of this. This is probably unnecessary. Paisley Wallaby: Well let's test the memoization change first and if that works then you can try to make some more fine grained improvements. Stealthy Dictaphone: Okay so. So what is the problem here? Cannot read property 5 of undefined. So I was apparently wrong that I didn't need to check here but I'm wondering why I was wrong maybe I need to do that so that I create this array first then I map it, let's try that. Still no good, what am I doing wrong here. What I'm trying to do is create an array of the same length of
s1so I have a row for every character in
s1then I'm trying to make each of those another array with a length of
s2so I should always have...maybe I...I'm wondering if my
s1StartIdxis making this call when I'm at the last character of
s1here which I probably, that's probably happening. I don't know if this is my particular problem I think it is. So how do I need to solve this? Here I have my
s2Idxis not equal to...
longestSubseq...I think I just need to return an empty string if either of these indexes are longer than they're allowed to be. So
if(s1StartIdx >= s1.length || s2StartIdx >= s2.length)then that's my longest subsequence I'm going to get out of that. Nope, still a problem. Do you mind if I
console.log()to see what's here? Paisley Wallaby: Yeah. Stealthy Dictaphone: Oh I need to do it before this. And
1can I not map over these empty values that are getting created in this new array that just fills with empty values? Yeah okay so 6 empty values so that might be the problem maybe if I just
memo = memo.mapand I fill it with something first
memo.fill(null). Okay, let me get rid of this
console.log(). Yeah so
“GTAB”. Yeah I didn't realize that I couldn't map over those empty values. Paisley Wallaby: Yeah I didn't know either. Stealthy Dictaphone:
“ABAD”, looking good. Alright. Paisley Wallaby: Okay great. So what's the running time of your code now? Stealthy Dictaphone: Yeah so this is. We're still...I believe this would be
O(n^2)because we are only...This recursive call we're basically getting once we get the results for each basic one then it's basically constant time afterwards so for the initial call I'm doing
ncalls to this and then this is going to each of those, each of those is going to have to go through the
ncalls the one time to get the results but I'm also. So it's approximately something like quadratic I'm also thinking about the fact that I'm going to be scaling somehow on
s2's length so if I'm calling the length of
nhow does the length of
s2factor into this? And it's really happening here where I'm iterating over
s2and here I'm iterating over
s1again so to be more precise it would be
O(n * (n+m)). I believe that would be the answer. Paisley Wallaby: Okay cool. I think we're pretty much out of time but you did well so any questions? Stealthy Dictaphone: Not that I can think of I look forward to the feedback. Paisley Wallaby: Yeah, I'll write up feedback right after. Great well enjoy the rest of your day and nice talking to you. Stealthy Dictaphone: Okay thank you so much, really appreciate it. Paisley Wallaby: Yeah bye. Stealthy Dictaphone: Bye.