Mechanical Llama, a Facebook engineer, interviewed Supreme Beast in C++
Given a list of words, match all words with other words from the list that are a prefix for the word.
Feedback about Supreme Beast (the interviewee)
Advance this person to the next round?
How were their technical skills?
How was their problem solving ability?
What about their communication ability?
Interviewee did a great job. Went over the feedback after the interview. Main suggestions are: * Problem Solving -- Always helpful if you can think of further ways to optimize (reduce iterations to one-pass over trie) that's awesome but definitely not a huge deal. I would still consider this a successful interview. * Technical Skills/Communication -- To save time during the interview, feel free to skip implementation of some of the helper methods (e.g. get, containsKey) and focus on the core implementation. You can always come back to implementing some of these helpers later. You can also check with your interviewer if skipping some of these steps is ok.
Feedback about Mechanical Llama (the interviewer)
Would you want to work with this person?
How excited would you be to work with them?
How good were the questions?
How helpful was your interviewer in guiding you to the solution(s)?
Supreme Beast: Hello. Mechanical Llama: Hello! Can you hear me okay? Supreme Beast: Oh, yeah. Mechanical Llama: Okay, great. Nice to meet you. So maybe, before we start, I can just quickly introduce myself. So yeah, so I have been working in industry for, like three and a half years. And I'm a senior machine learning engineer at a big tech company. Yeah. And I'm pretty new to using this platform with this actually being my second interview, but I've been doing interviews at my company for quite some time now. So hopefully, it should be smooth. But I'd love your feedback at the end as well. Yeah, maybe you can introduce yourself to if you want to, you don't have to either. Supreme Beast: Yeah, I'm a senior software engineer, working in a machine learning space for more than 10 years. I'm currently interviewing and have some on site schedules and using the site to prepare. I have some interviews already. Mechanical Llama: So cool. Sounds great. Yeah, so I guess we can probably get started then. Just a heads up. Normally, when I asked this question, it's during like a 45 minute interview. So maybe we can work on it until like 3:45, and then stop, and then I can give you feedback. Or if you have feedback for me, we can do that as well. Does that work? Supreme Beast: Yep. Mechanical Llama: Okay, cool. Yeah, so just start writing out the problem then. So basically, what we have is, for input, a list of words that looks something like this, the list of words, that's the input. And then for the output, I want you to return a list of pairs, where the first word is the prefix of the second word. So just like looking at this example, right, Apples, App, and Apple are present in the input list, right? And app is a prefix of apple. And then like that app is also a prefix of apples. And then we have Apple, apples, right? And then car and cars. And I think there shouldn't be any other parents in this list. So I just want you to write a function basically, that given this above list returns an output that looks like this. Supreme Beast: Alright, so given list of words, return pairs, and prefix followed by the word. I guess one option is to, like reverse the input list of words. Okay, just one question. So when we say return prefix in the second in the output doesn't mean the prefix has to be one of the words in the input list, or it can be any prefix. Mechanical Llama: Yeah, that's a good question. So the prefix must be a word in the input list. Supreme Beast: Okay. So then, we start kind of think over kind of like, brute force estimation, or scan the input list and then for each word in let's see, let's say it's important to have some kind of hashmap. So I guess for each word, we need to find its prefixes or for its prefix, we need to find this word, actually, okay. I see. So basically, for each word, we tried to see if this prefix of any other words were more natural structure was this so this would be an array. So another try. Trie maintains prefixes in a trie. If there is, if it's a match, if the prefix matches this prefix of other words, then traverse the path to the end of that word. Okay, and return and return the various prefix word there. So it's actually it's not a brute force, I think it's a good solution. So basically using a trie to recall the trie construction. But basically, given the choice, let me let me draw this example. So let's say we have a general try on each node over trying to maintain, let's say, an array of characters, we can assume let's say that all characters are 26. I mean, all lowercase. We can look to assume that each node is this array of characters. I mean, it says the array of characters and the other aspects of it that would maintain the actual character. So let's say we kind of defined the struct not in the trie. Is that character? Do you think? I guess, I guess we can just do his thing, if it's the children or get... Yeah, I guess they have to maintain the value and then the children web, so the value would be the actual character. Which would be let's say, A, B, or C. And characters would be best if having an array of 26 elements. And if let's say, on the next iteration of the characters, the try will miss the mark. Like, one of the elements is one, let's say. And let's say the first element is one right. So it means that from character here or say it was B, we get to the next character, this would be a so this corresponds to as basically a and it could have actually multiple characters in this case. So we can mark multiple children. So like each, each node can point for multiple children, meaning that it can, it can it can be a prefix of many different words. So, during this, we'll try let's see the complexity and the... Think of the complexities. So I guess this is the last of the of the I guess, to create this trie, you'll have to traverse the input, so
O(n)definitely, reversing the trie, so, if you want to find the prefixes, it would be the length of each word. So like another I guess
O(m)like so. Like or n where n is the number of words multiply I guess yeah, half a multiplied by M which is like the M families are like the length of the longest word. That's the time complexity. And for state complexity will be well trie will take us like
O(n)trie will be
O(n)complexity. Mechanical Llama: Okay, the trie construction right? Supreme Beast: Yes. Well, for the space complexity, and also I'm this focus on optimization. So, given that we have a trie we have to create you have to traverse the entire input, but then for each word in the inputs... I guess firstly construct the trie so, to be like to pass problems like first pass we will have to reverse the input and create the try. In other words in one pass, so I think we will have to build it. So like pass one, we'll build the trie to insert each word. And then pass two would be like, going from each word. And let's find, for each word try to find the complete word basically, for each words in input look up, look it up in tri and tri to find complete word. So like, let's assume that each word is a prefix and we will try to find the completion kind of thing. The final the total, like the complete word, right? So the words will be like the prefix. I guess the rest will be a suffix, right? So like trying to find the suffix for each prefix, or suffix is every word in the in the input? Something like this. Okay, any questions on this? Mechanical Llama: Um, could you elaborate a little bit more on pass two, to like how you're going to traverse to trie? Supreme Beast: So, traversing a trie, so let's say we have let's say we have something like let's, let's take an example, a which goes to the p which goes to p and apple. And ABC will be a will be also prefix for Apples, App, Apple or Apples, right? Okay. Okay, I guess I can draw this. So this is the node pointing to the child. Now, and then this one also pointing to us. So we will have four App will have some kind of end of the word level, then we create the traversal, we usually mark each word like the end of the word. So the word so now we go from app. And you see that there's a there are some children which are not set to zero. So there is some children will say at index corresponding to letter L. So this is an array for 26 elements. And it's initialized to zero. But once we let's say we assign, we create a new node we assign, let's say L, index of L will be zero, so it will be set to one and everything else will be zeros. So this long correspond to index of the letter L in alphabet. So in this case, we see that there is one index, which is not null, not zero, so we continue traversing it, we go to L, we go to E, so we found one word is end and we exit now we see if there's additional words which are in the inputs. And also in the status, right, so we find s and the return, the only thing I think I need to maintain, I might need to maintain like additional data structure for lookups like some kind of unordered map. So once while I'm traversing this path from A to p p l e, I have to kind of look up I mean, one option is simply like you don't need to really separate the the structure we can just say it's a find the end of the letter, I know that I constructed this from the input so I can simply say report there or look at their outputs. Which says prefix add word. Sort of prefix is app, word le then I continue if there are some additional children, I add another pair or tuple in this case, app, apples and by doing for each word, so in this case will be Apple, right so in this case, I go to I traverse to find my word Apple here and I see those children so add another word. And on the next iteration of this loop, I'll add another pair for apples it will be add their Apple apples normally for every other word. So that's how it's going to work. Mechanical Llama: Okay, that makes sense. Supreme Beast: So the only I guess problem is like to create a standard, trie data structure, and then just add the pass one to create it and then pass two to kind of to quickly create the pairs. So let's start with that. I have about 30 minutes. So, so let's see. So let's say we have this struct, let's define the struct first. Well, actually trie node. You will have some kind of constants. That's the first one. And this is array. Yes, okay. In this case, you'll have to maintain like a list of array of pointers. Final guest pointers or links to children. So this is like array of 26 pointers, each pointer is pointing to another node. We initialize a bunch of nodes, I guess we can, we could have done like a vector instead, initializing everything is not always the same. So this is a trie node. And also private variables, then looks at the public plus, sign up. And then for public, you have to provide this end of the word and is n to mark the world when we, in turn, hoping that we can say, Well, now let's say if it contains a key, so some utility function like this contains key. So let's say given some character, this say, well, first, let's translate it to lower. And then the plan, the index will be basically 26, as our index will be set to minus a. So it's like assuming the encoding and lowercase so this will be the index from the first from the first character of the array to a and the rest will be like the index from their sense in order to be incrementally larger than a from A to Z, right? So we assume here. So this assumption, it's like lowercase a to lowercase b. Mechanical Llama: Okay. Yep. Supreme Beast: And then once you find the index, we say, Okay, if index for some reasons, we don't get like ASCII character, but some kind of, like non ASCII character, or like, some number of services you're not supposed to handle here. So if it's exceeding this boundary, this limit of 26 or if it's negative, which also can't happen, if it's some kind of special character, it will return false. Otherwise, return basically, whatever the links are pointing versus like return links in that particular index, activity have to return bool. And like if you contain the key so you just return if it's not equal to null pointer and I was making progress on this. So yeah. Return false if it sits on bothering the index, otherwise return simply if we contain this character, or if we have it in our links or not. Given the character, do we have it in our children, the other utility function will be to put supports character in the node. Certainly can be void. Say you want to put the character in in this node or not. We say that was toLower first. And then think about it, you don't really need nothing new. Nothing is this value basically like we just need the actual links. And the point of the trie node to be the like, we have a pointer in the parent, we will have the pointer to this current node, so we don't need this special the actual index of that of that node will be like will be the index in that array of array of pointers. So we don't need a special value variable. So simply here, if you want to put something in the node, we say links, this is like putting the node in the children of this node, so links of character minus A equals to... Okay, so my bad so in this case, when we say we want to put it in particular note, so let's do this. Let's say we given the node we wanted to add a character to it as a child, so the say assign the child pointer, this index to this node, this will be this is like adding a child... adding a child and I'll do this and then we want to do a get. So this will return the trie node pointing to the model given the character this will return once again transform to lower and let me just get us from this children. So it's like get child. So trie nodes... Whatever this this element of the array pointing to be consistent, okay? And then we want to return this node. So return on it. So let's say we have this trie node structure. Okay. And then let's define the actual tries. So plus trie. That's the rules. Well, let's have a constructor. I guess it's fine. Okay. So let's do a constructor for trie. This would be let's say we have this empty constructor. So just taking find on so this empty root. Now how do we insert to the trie? So we do insert the word. So, word inserts given a string on the non recourse finance and the world the best in the word from disaster on this character of a check. So, if not the content is there already. The other the other ways to get it from the existing node in the trie, so let's say if we already created the function to check for contains this word does not contain the key for this character, then we put a new node basically, visuals... Do node put character and then you trie node. Otherwise, so it's already there. So we have to just get it from the children of that not so node would be child, your child at index responding to this character. Okay, that's it for the first pass, and then the inserting the world. And so then once you finished inserting them, we said node, set the flag is absolutely true. So that's it, and this is this one here. So we insert the node in the trie. So insert the word into the trie. Now we have two lists missing. We need to kind of find if any word in the trie start is some prefix list of names. So that's like where we have nothing. I guess if I need to search normally, we would like search a word in the trie. And this case, I guess, start with prefixes is all we need here. So we need to return I guess, something like a vector of strings. We have to return a string right? So vector of strings start with prefix. Mechanical Llama: No, wait, whatever your vector of strength is, is the method that returns a pair or is this a different function? Supreme Beast: Well, we will call it from a separate function. Let's say we have a main let me actually describe the main, that's how we call this something like create a new trie. Okay, let's try... let me do like for each word in the input, insert word. Without this, this is the first part basically. Yeah. Now the second pass is this. So for each word in the trie, so each word in the inputs need to construct those, I guess pairs right, so maybe we can explicitly make them up there. So first of all, we get the vector of streams restarted as word. So let's say output is with golden word. And we call this function starts. Starts with, and let's say the signature for this function will be giving a string as a part of this particular words, given the words and given the prefix, so W is basically a prefix, W is a prefix. And each one of those words are the words which are the suffix is basically the complete words I mean, it's not the suffix it's like the complete words to start this prefix. So list of words, which start this basic problem. Now once we have that list, we can simply create a set of list of fares. So for each one of the words, also which complete word with this list of words. You do something like this, create a new pair. So let's say for the output will be like the vector of pairs of strings, this will be the pairs. Let's make it more accurate. So like let's do pair of string e would be equal to make pair w and complete word. Actually, I think I'm like this. This should be like this level of complex words and then just end this pair to the list of output. Okay, so definitely the logic so let's start with basically would have to simplify the all the word which start with this prefix. So let me specify the statement. Find all words, which start with string W. Any questions on that? Mechanical Llama: No that makes sense to me. Supreme Beast: So then yes. So then to start over there start over the prefix w and I have about like 10 minutes. So okay, we started similar same as we did here. So, let me copy this line, starting from roots for each one of the characters which one of the characters in this string? So actually looking first initialize this vector of strings, outputs or results. So, for each one of the characters in this input string, we would say age should they go to word. And if not constraints of this character, the triggers about some other people for the snacks now. Yep, ch. So this is like, traversing it. That's okay. So, in this case, we need to check if we if you found like is there any more any words which and this particular letter so, if not, does not contain you then else we would continue. So, I think we have to I think we will have to break in that case. Let me think about this else condition. So now we have to return this list of strings. Remember this, this is this one this is this one. So we return the results. If we found the words, so yes, here when we say it will not complain when we say if nodes is end if not is end, then result push back. Okay, I guess I'll have to keep track on this word. Yeah, I guess I'll have to keep track on the actual word while we're traversing it. So something like let's say we create a new vector of characters. Well, it lives for the stranger. This is string, string s. Well, yeah, this point, we know the index of the characters so we can say that the prefix you don't need to like specify those keep it in memory, we can just say the only thing we need to find is the same thing two conditions if not is ended then the result of the push back I guess here we can see it's a new string every time we do this so string s and then wanted to reverse this thing we would add string s plus equal character. And then and then if not contains the character is if the children contains this character, we get the nodes. This is the end of the word then we say resolve back. Yes. We will see I guess here we'll have to say... Something like this. So this point we're just pushing back s else if it's not the end of the world we continue traversing it so we can just say continue I guess in this case. This is found end of the words. Oh, it's possible that you might find several of these things. So I guess Okay, I need to do something else. I need to do some additional things here. Like you might find several several awards if you continue right. So here we say that we found awards, which starts with S and you push it so once it pushes you have to kind of okay it makes sense so we can keep using so I think we Okay, we don't need to restate anything we can keep using s next iteration, we will simply check it again. So if there if there is additional let's say Apple from Apple to apples, we will check next node that we have s in the node right so we'll go to that node. We get the node for character s and then you find it as n so we will put another s so in this case, the S would contain apples instead of apples so it would work the same. Now here the only thing here like if node contains p don't need to do so what we do in case it doesn't contain the keys. So in that case, we can just say return I guess we can say return results return results and if there was no word found and it would be like empty vector return for in this case, we can also say break which is smart. So the other thing to say else break. So like if node does not contain the keys and the break and return whatever we can so far found something like this. Okay. Mechanical Llama: Yeah, so let's say we have like multiple branches, I guess coming off of the prefix. So will we be able to get all of the words for example, like if we had cat, ca and then car right. And let's say the prefix was like ca right? So with the above, like logic get cats and cars I guess. Does that makes sense. Supreme Beast: Yeah, let me think. So if you want to keep actually yeah, this will be more like once we find a single word by traversing it. So I think it's good to get so we go from... Okay, we start from C let's say from the trie, right. So you have car. And we have also CRT. So, yeah okay, I guess it would answer the question if it was like CR. Something, let's say, so this will be only going to one branch. Yeah, this one branch, but not the second branch. So I guess you're gonna have to kind of backtrack and for each node folders, maybe recursively, or something like that. So basically, we have to traverse each child in this trie instead of following them. Mechanical Llama: So, definitely pretty close. But what you have missing, like, Yeah, I was gonna actually I think you're close. Supreme Beast: So basically, if we find them multiple nodes, so say, okay, we go through each node, we get the nodes. And let's see whether the folders from... so I guess, if we say that the knob contains the starting from C and the root of the tree have a say some will say start from root empty node, and then the next node will be C. So we go to C, we added to the string we check is not and then we go to the next character, next character there will be basically an exception will be as we go to the next year to be added to the string, we go to t. I mean, we check if t is contained it is now here we check only for the missing every time we check for the which character is certain that array, we have to check for all characters in that array. So when the set contains key, this is just a single character. But the actual list of pointers on line 52 may contain several characters. So the contains t, we might have to modify it a little bit so that we return like all the pointers, like to all the children, not just one child now. Mechanical Llama: Yeah or I think you could even have like a maybe like a loop, right? So once you traverse like ca right then and get to the end of that string, right? Then you look at all of the children and then get all the words from that point, right? Because CA is a prefix of basically like, as long as CA is a word, right? It's a prefix of all the branches below it right? And their words. Like another loop? Yeah. Yeah, another while I like a for loop. Take a look at all the children. Supreme Beast: So while this is kind of like, well, as long as you have additional children, we have another like function like while node. I guess for each node, we can just say for each child or something like that. So for each pointer there, so for each point there in the links, basically modify this slightly so that each node we will say. So I guess we'll have to modify this was basically if node, there's no reward and equals back. Something like this. Yeah. Sorry. Yeah. So it must be a breach, not if it's the end of the word. Okay, the only thing we need to get is the actual character. I think I have to modify this a little bit so that I get the actual value for the node and not just the children. So I think that having this value would be helpful to have a character in the trial node itself, much as the children. Then the actual value itself. And maybe... Yeah, and maybe even make it public. So this is access otherwise, it's kind of all based on the children are also but in this case, we have to get the value. So, we say that node ch. So for each, but let's call it value. So, value and you got the value the end s plus equal to this value and also check if node is the final and the end of the world then the push back this thing. Emphasis there for... continue. And I think many parenthesis. For each character in the string, we... Okay, sorry, I actually have to edit here. So like, not equal to not so like if this logic to check the contents has to be here, basically, if contains but before I check the children, so what if contains key character, then do this thing, so there's no difficult to node get. I'll have some modified slightly, I think it's more like the rounds of getting the children you get that you get that from the point. If this is not there, then we will simply I guess, break in this case. Otherwise, once we retrieve the node, because through the children, so go through the children of the node. So first, we'd like to choose another node to the children. But also children, and if end of the words, basically, if found, and the forks, then we would put it back. So once we go through this, we add the character to the string, and then for each child will try to add it to the string as well. And stop if we found the end of the word. Yeah, something like this. Mechanical Llama: Okay, yeah. This looks good. Yeah. So I guess that's kind of like the end of time for like, a real interview. Normally, I guess, like, I would also like, expect candidates to like write some test cases, or go through examples, which I think you were kind of doing as we were like, walking through, like the code, but that is something to keep in mind. Like, generally, it's nice to be like, a couple minutes for testing, but I don't know every company is different. So you might want to check if that's something like the company you're interviewing at cares about. Yeah, other than that, I think this looks good. I can go over some of the feedback I have, that would be helpful for you. I also have it typed up but might be helpful to hear it verbally. Okay, okay, so, okay, yeah. So let me just look at my notes at a higher level. Oh, yeah. So I guess just one thing, like comment on the overall approach, so I think it was really good that you came up with the trie right away, like sometimes candidates like struggle to like come up with the trie solution, but like noticed it right off the bat, which was good. I think sometimes it can be helpful to like, explain a little bit like why this solution is like optimal or better than another solution, like contrast it with another approach. So like, the brute force for this problem is n squared, I guess if you compare every word to each other right? So it's nice sometimes to mention that like why your approach is the best one but it's not a big deal that you didn't. And just something to keep in mind. What else? Yeah, and then for the actual approach, so I think, like what you did is perfectly validly, look for all the words that have that prefix, right and then form the pair's at the end that totally works. There's some ways to optimize this a little bit. So one is that like, so when you're looking for the prefixes, or sorry, all the suffixes, I guess of a word, right? You're kind of doing multiple passes over the trie, right? Because like, like in ca and was like cats cars example, right? Like, you know, you'll have to, like, traverse the trie entirely for like car and cat, right. And then if you have another word like car, right, you'll kind of have to like, traverse through a lot of the same nodes, again, so kind of to like avoid traversing the same nodes multiple times. There's also a way to like do this with just a single pass and DFS. So basically, the idea is like, you do like a DFS. Just like once over the entire trie, right? And then as you explore each branch, you keep track of all the words you've seen so far, right? And then once you hit a leaf node, you basically form pairs with all of the words above you. That makes sense? Supreme Beast: Yes. Oh, vector from each iteration of the DFS? Mechanical Llama: Yeah, exactly. So something like that. So that's one way to optimize another way to optimize is actually like sorting the input list in advance. So that way, like, when... basically, as you're inserting, right, you can also check if there's a prefix of that word, already in the try, right? So that way, like, as you insert, you're also checking for prefixes. So those are just like, two different ways to optimize. And I wrote them down as well. So you remember, but yeah, that's just something to keep in mind. But I think even your approach, like was pretty good. It's like hard to like, sometimes come up with optimal approaches, just like short time. Yeah. But yeah, other than that, I felt like you did a great job. I had some other comments, but they were pretty minor. So I just typed them up. Yeah, yeah. Did you have any feedback for me? Supreme Beast: I think it was very helpful. I mean, with kind of great feedback, and very good question. So it was very helpful. Mechanical Llama: Thank you. Yeah. I'm glad it was helpful. Yeah, and if you have any other questions, I can stay for another five minutes otherwise, so. Supreme Beast: I think I'm good. So yeah, I'm gonna go through this again after the interview after the interview, and it was a good exercise. Mechanical Llama: Okay, cool. Yeah, that's great meeting you and hope you enjoy the rest of your Sunday. Supreme Beast: Thanks. You too. Mechanical Llama: Okay. Bye. Take care.