Memory efficient lookup
Create some sort of data structure for storing and looking up text in a memory efficient manner.
Read more about the questions
Feedback about Samurai Razor (the interviewee)
Advance this person to the next round?
How were their technical skills?
How was their problem solving ability?
What about their communication ability?
You were quick to start coding, which is good, but also felt a tad sloppy and you needed to go back and revise pretty often. It's fine to write pseudo code in a quick and incomplete way, but when it comes down to writing the actual code solving the problem you'll want to be a bit more deliberate. Great job, though. Good luck!
Feedback about Intergalactic Avenger (the interviewer)
Would you want to work with this person?
How excited would you be to work with them?
How good were the questions?
How helpful was your interviewer in guiding you to the solution(s)?
Interviewer was kind and patient, repeated requirements when asked and helpfully guided me through solution. Great combination of data structure and algorithmic complexity; helpful suggestions to improve code. Only potential suggestion for improvement might be that I didn't understand the problem immediately. This could be entirely due to me, but it might be worth trying different ways to phrase this particular problem.
Samurai Razor: Hello? Intergalactic Avenger: Oh hi. Samurai Razor: Hi. Good to meet you. Intergalactic Avenger: Nice to meet you too. Alright, so we'll just dive on into it. Really question, what computer language are you the most familiar and comfortable with? Samurai Razor: Sure, I think
Java, so on. I think
Javahas hashmaps and
pythonhas dictionaries and I'm sure that
the dogs walkand I want to transform that into
the dog walk, singular. So I don't really have that much use for the multiple words next to each other because for my particular application, I'm just looking up individual words. Samurai Razor: Gotcha, so you're not looking for patterns between the words or within the words. You're looking for meta patterns that can be sort of mappings that aren't inherent in the text but are inherent in language itself, transforms that are applicable to language as a language in general and can be applied to each of the words to kind of create a meta out of data. Intergalactic Avenger: Exactly and to make this a little more concrete, they just take this over that you might say as input, you have a list of these words. So you'll have
dogs dogand dog also goes dogs and
walk walksand walks goes to walk kind of. So you'll have an input list that you'll want to be storing, but you want to store it in an efficient fashion so that you can sort of look them up. And certainly a corpus can help you with this because you might say, well you know I'm not looking up these words in a vacuum, I'm looking at these words in a certain context and I might want to find a more efficient way to store the more commonly occurring words for purposes of storage or something like that or for lookups. Ao from that perspective I would think... where I think you were going with that would be things like storing n-grams of letters that you might say something like, you know the pair of letters
d omight show up very often in this input list and so then you could potentially have a very short sort of code for what
d ostands for and then maybe
w adoesn't show up very often in this input list, so you could have it be long, but that was sort of where I was thinking you were going with that, but yeah. Samurai Razor: Okay I got it. Maybe we can do tries. I'm not sure whether that is applicable to this problem, but Intergalactic Avenger: That's exactly where I was going with this problem and I was going to ask if you've heard of tries, because that is what I see as the canonical sort of best solution to this. Samurai Razor: Exactly, within the try you reduce the source because the traverse like you for every word that starts with s, you use the same storage, so like you use the same s and you go down the tree. I think this is the idea for T-9 word completion and so on. Intergalactic Avenger: Oh yeah. Samurai Razor: So tries would be the way to go probably. Intergalactic Avenger: Yeah okay, so let's just see if we can put one of these together really quick. It's not, you know, it doesn't necessarily have to be the most efficient implementation but in whatever language you like, maybe let's just make a little function that... and you know the interface here you can choose language up on top and it'll even give you a little way to test it out. And yeah, so like the interface the function would be it takes in a word and it returns the the mapping and for simplicity's sake we can even just say that you'll assume that the input word is always there, we don't have to do all of the verification. Samurai Razor: Yeah. So you go from word to mapping... so I mean I'm visualizing like a trie and like I'm trying to think like... so you input a word and then you kind of walk down the tree and kind of follow where... I'm not sure in this case what a mapping... Intergalactic Avenger: So it takes in the word dogs and it returns the word dog. Samurai Razor: Gotcha okay. Okay so it returns that mapping okay. Intergalactic Avenger: So it returns the value of that key value pair. Samurai Razor: Got it, okay. Let's see, so I'm trying to think of how that would work. Intergalactic Avenger: So if you want to actually like do this with all the syntax highlighting you can pick you know like
Pythonor whatever language is easiest for you. Samurai Razor: Got it. I'll try it in
D, and then it points to a child that's
Oand it points to a child that is
G, but what is the output? How do you know what to return at the end of all of this? Samurai Razor: So basically you walk down... So what you determine basically? Intergalactic Avenger: Where do you determine the output? So I'm seeing here like how you walk through this structure based on the input, but then how do you know what the appropriate output is? Samurai Razor: Got it, let's see. So I think the... so you go
D O Gand then or... what if you go
D O G S, it should match dog, yeah how does it match dog. One way to do it would be to have in the in the
Gor in the
Shave a pointer to a string that is where it maps to basically. So you go
D O Gand the
Ghas a pointer to dog. When you go
D O G Sand the
D Othen the
Onode would not have a... or the output property would be
nullbut if you're on the
G, then the output property would have the value
dogbasically. Intergalactic Avenger: Perfect, yeah. Sounds good. Samurai Razor: Okay, let me see. So you will go down the Trie. You go
D O G, the
Gwould have a property for the output. I think that should there should be some methods on the node so this should be a contains method probably or a map, so you should be able to ask a node take this word and map it to the output. Intergalactic Avenger: Yep, or I mean either way, you could have it be on the node itself or you could sort of do it like it says up here where there's a function say hello and it takes in a node and the word or whatever it is, either way. Samurai Razor: It could be it could be based on the word on the actual data or it could be based on node. Yeah totally. You probably don't need that okay. So here we go. So basically first I think we basically need... if you ask a node whether it contains a certain word okay. I think we're good. Do you see any other properties we need or? Intergalactic Avenger: I actually... I'm curious if we even need the value, but let's work through it. Let's just write this up, see how it looks. Samurai Razor: Sure, yeah. Okay so first we need a factory for nodes, so we do something like this. Intergalactic Avenger: Let's even just simplify things for a little bit and say, let's assume that you have the node and you're just going to try to to walk it with some input. Samurai Razor: Okay, so you have a node, we're going to take a node... you're going to basically say node dot get output for dot, basically something that? Okay, so that would be, I'll do it on the prototype it's like a
O(log n)I think because you're like going on a trie, so like for every next letter in the string, you basically choose... Intergalactic Avenger: What is it N in that? Like is that the length of the string, is that the finding out of the node? What is n? Samurai Razor: N would be the length of the string. I'm just trying to think, in a binary tree you go left to right, so you can just do binary comparison. If you have say there are 10 children and if you just put them in an array, then you would have to go through each of the items in the array to find the next child or the next letter. So this probably has to be an object over here, so this with children should be an object, so a hashmap basically, where you can instantly look up the next letter in the string. Intergalactic Avenger: Excellent, yeah. So let's just flesh that out a little bit. That's definitely the right direction to go in. If we said, if we make a couple assumptions here that n is the length of the string and M is the size of the alphabet, so like in English this is basically 26 but in Chinese it's like thousands. Using the children as an array, we're looking at big O of what? And children as an object, it's big O of what? Samurai Razor: So the... if you look at the number of children in an array, you could potentially be the size of the alphabet because the next letter could basically be any of those letters, so you're looking at every level of the tree with a trie, you're looking at a potential of M different letters and you have to go N levels deep, so that n times m at each level m times, I want to say might be wrong with m times n probably, n times I could join the m letters. Intergalactic Avenger: Just type it in there so we can see. And then when it's object? Samurai Razor: So an object, the lookup of a specific next letter will be order 1 because you do a hashmap. And the... you would have to do that at n times, so it would be order n I think, because it n levels of the tree, you're going to have to look up in a table which is order of n. Intergalactic Avenger: Yeah okay, so we're going to choose the the object one for obvious reasons, okay. Samurai Razor: Order one look up. Intergalactic Avenger: Cool okay, yeah let's fill out this this prototype that you have here. Samurai Razor: Okay, so the get out function. Let's see, so what do we need to do. So we check whether... So right now we're just asking some node to check to return the mapping with the output of the given string. So first we to check whether the current node equals the current string character. Intergalactic Avenger: So if you're using the children... if the children is a map from letters to nodes, then do you need to have a current value and do you need to make this check? Samurai Razor: I guess we would have found the this current node bye actually just... even if you're at the top level when you're looking up the first letter in a hashmap, so you know by definition that we're on the current the current letter exists actually, that this node is for the current letter. So basically we would check whether there is a node has a output value. If so, return that. Else if there is a next character in the string then find that child and recurse, else return false or null. Sort of like that. So basically if we're on the current node and there is a mapping, we return the first mapping we find. Intergalactic Avenger: So is that correct? So for example, the the word do, the prefix of dog, do, it if this is an English language thing, it will have a mapping and it will it will have the mapping of do, like do is the dictionary form of the word do. Is that right mapping for dog? Samurai Razor: We might actually want to first use all the characters in the string and then see whether after the last one, when we're after the last character, whether at that point there's a mapping or that's an output, instead of looking for the first output we can find, so like reverse it actually. Otherwise we're going to find... so if we're at the end of the string that we're given, we just check what the output is and return that. If there's no output, we return null. Else if the string still has characters then we just go to the next child. Intergalactic Avenger: Yeah. So let's put that in
Dor something, then it will end up down here, and the previous one will return you know whatever the mapping is for
D, but then it will continue on down and sort of get nothing down here. Samurai Razor: I mean if there is no actual mapping. Intergalactic Avenger: No like let's say there is a mapping. I'm just seeing that the top level that called this, I'm not convinced that it's going to ever see the output because these return statements over here only happen for one case and for the other cases, there's no return statement. Samurai Razor: So you mean like if there is no... like if there is no output at all, then there won't be any return basically? Intergalactic Avenger: Uh no. So let's say there is output and you put in dog and dog is in the map and you and you want to get that back out. I think that this implementation is not going to give you that output at the very end because you know child here is going to get called with dog and it's going to call the substring with og and then this will be called again with the substring of G and then up here it will finally get to returning the the output here. That just gets lost sort of here because nothing is actually... this return value kind of just gets thrown away. Samurai Razor: Gotcha, I think that's exactly, I see what you mean. Yeah you do return it's like at the bottom leaf so to speak, but then it doesn't go actually up the chain. So I think we might need to do this. We at least need to make sure that the top level intermediate levels return the actual value that gets returned from the bottom. Yeah totally. Rookie mistake. Intergalactic Avenger: No problem. So one other thing I'm noticing up in here. It's a very minor point, but do you see a way that this is a little bit more... this piece from sixty to sixty five is more convoluted than it needs to be? Samurai Razor: I think we can probably simplify a little bit, let's see. So if the string dot length is zero and this output is null, then return output else return null. Well I mean one thing I can think of is we can combine the two if statements. Intergalactic Avenger: So specifically speaking from sixty to sixty-five, that I think there's more lines of code there. Samurai Razor: Okay um, we could just return output. Yeah, so output is either going to be a value we want or null and either way we want to return exactly that, so we basically could say... I'm just going to edit this here... basically we could turn this into that. Intergalactic Avenger: Exactly. There you go, you've got a trie. Very good. Samurai Razor: I like that they were doing a trie, I've never implemented that but it's a fun data structure. Intergalactic Avenger: Yeah, and as you can see, the lookup function is very simple. It's just what? Five, six lines of code we have here. So here's an interesting question for you. If you take this function and copy and paste it and now turn it into put output. So let's say you want to add something to this trie, and instead of pulling out the value, put the value in. And what you'll notice, if you think about it for a sec, is that the two implementations are very similar with only one very small change. Samurai Razor: Got it, let's write that. So I'm going to copy the entire thing. Put it down here, and call this add node. Intergalactic Avenger: Or add value. Samurai Razor: Yes. I'm going to add a string. So let's see... Intergalactic Avenger: So when we add it, we're adding a mapping. So remember that what's happening is we're giving it as input basically this list of like dogs to dog and walked to walk and so forth, so we're actually going to be adding a mapping to it. So we're going to be adding a key and value. Samurai Razor: Got it okay. So we're going to basically traverse down the trie up to the... I'm going to take key, go through it and at the end of key, can we assume that... maybe if we start by assuming that the key already exists, the entire like chain is present, and then at the end of key, the last node, add a mapping to the value. Does that make sense? So again, if string is length 0, then add value as output. We don't need this step and that step we do need. So basically we're going to do if string dot length equals 0, then this dot output equals value, else var child equals this children key 0. Not sure if we need to return... so this would be key of substring one and we could... I mean depending on what we want the output to look like we could return true for success, but... Intergalactic Avenger: Let's just not have it return anything, it doesn't need it. Samurai Razor: Yeah, it only goes down the tree and just performs the action. Intergalactic Avenger: Alright, so you're definitely along the right the right track here. So we had renamed str to key, so you want to change that on line 71. This still says get output. And add value takes two parameters. Samurai Razor: Yep, you're totally right. Let's see. So this would be the key and the value would be values, you have it. Intergalactic Avenger: No that sounds good. So the interesting thing here is that it does make a rather stark assumption that the the key is already in place and so if the key is already there, then this will work. But let's add just a tiny little bit of code so that if the key is not in there, that it will go ahead and add it. Samurai Razor: Cool, yeah so sure. So we're going to do that over here. Let's see, it should be good. This is children equals new node. Intergalactic Avenger: So you just need to set child again. Samurai Razor: Yes. This is like a double code. So basically all I need to do is if this is like you know repeatable piece of code, so all I need to do is say if the children do not yet have that next letter as a property then create a new node and add it. So all I need to do is if that's not the case, then we do this else do that. Intergalactic Avenger: There we go so. Samurai Razor: What we want to find is the next character. We say if among the children of this node there is no such characters yet and we make a new node over here and add it to the children and then once we have that child we can say we get that child and we recurse next char. Intergalactic Avenger: The child was right the first time. Samurai Razor: It was? Okay, thanks. Yes right. There we go. Intergalactic Avenger: Alright, and now you've created a trie. Alright yeah that's looking really good. So did you have any last questions for me? I think that's pretty good. Samurai Razor: Cool thanks, thank so much for time by the way. Yeah I mean, so for me like I'm sort of getting back into the game, so this is like one of my first... this afternoon I did like a whiteboard interview, so this is like very, you know. well shared coding. So it's been really interesting. I love that I'm getting back to tries, that's really good. So whatever feedback you have is totally welcome. Intergalactic Avenger: Okay great, yeah I'll write it up in the little report. Samurai Razor: Oh perfect. How does that work? Is there like a count or like is it automatically sent? Or something like that or? Intergalactic Avenger: Yeah, at the end of it, like I write up a little thing and you can see the feedback and all that kind of stuff. Samurai Razor: Perfect, cool. Thank you so much. Intergalactic Avenger: Yup, take care. Samurai Razor: Thanks man, I appreciate it. Intergalactic Avenger: Yup, bye.