Ghost Koala: Hello.
Winged Avenger: Hello, can you hear me?
Ghost Koala: Yes, I can. Can you hear me?
Winged Avenger: Yes.
Ghost Koala: Awesome. How are you doing?
Winged Avenger: Good, how about you?
Ghost Koala: I'm doing pretty good. Thanks for asking.
Winged Avenger: Awesome. Cool. Well, let's go ahead and get started then. So basically, I just want to first get an overview of your background, and then what you're hoping to get out of today, so we can tailor this and get you the best experience possible.
Ghost Koala: Sure, absolutely. So just briefly, my background: I graduated back in 2011. My education is in mechanical engineering, and my career so far has been purely in manufacturing; I've kind of worked my way up to product management. But the whole time I've been doing a ton of self-taught programming and coding, to the point where I think I now have enough skills and know-how to transition my career into software engineering, and that's what I'm trying to do. So that's the background. More specifically, there are about four companies right now that I'm talking to and going to on-site interviews for. The big one, the one I'm most excited about, is Google, which I have an on-site with tomorrow. So really, what I hope to get out of this is, as much as possible, to emulate what that interview is going to be like, just to see how well prepared I am, or really just to practice and prepare myself for tomorrow.
Winged Avenger: Gotcha. Yeah, that makes sense. Cool. And given your background, for these roles you're interviewing for, is there a particular area of focus, like full stack, front end, machine learning, anything like that?
Ghost Koala: Yeah. So the jobs I'm interviewing for are pretty generalist, or backend focused. Nothing specialized like machine learning. I know some front end stuff, but I'm not really interviewing for that. So I'd say more backend or systems-level programming. And then, aside from that, based on some of the feedback I've been getting from interviews I've done, what I want to focus on today is making sure I walk through the problem, and walk through the brute force approach, before jumping into a more optimal approach. Really, giving you an opportunity to see my problem-solving strategy, versus jumping straight into some sort of coding or algorithm at the end. So that's going to be my focus: try to take a more structured approach, rather than jumping straight to what would be the most optimal way to deal with this.
Winged Avenger: Yeah, that sounds good.
Winged Avenger: And honestly, a lot of the feedback I give to people who are interviewing is specifically to do that. So I'm glad you've already gotten that feedback and are working on it. We can definitely do that today, for sure.
Winged Avenger: Okay, awesome. So basically, for the rest of this, we'll spend most of the time doing a practice technical interview, and like you said, I'll try to treat this like I would a normal interview when I'm interviewing a candidate. The target is to hopefully save about 10 minutes at the end for discussion, feedback, or anything else you want. So we'll say about 45 minutes for the actual technical coding discussion part, which I think is fairly standard. Okay, cool. Well, let's go ahead and get started then. First, I've got to set up this CoderPad: what language would you want to code in?
Ghost Koala: C++.
Winged Avenger: Okay, awesome. C++ is the majority of what I work in. Let me get rid of all the starter code. Okay. So the problem I want to talk about today hopefully starts simple, but as we discuss it and peel back layers, hopefully there will be some interesting challenges and trade-offs that we have to discuss and figure out. So, for example, since you're interviewing at Google: a company like Google has a huge number of software engineers and is obviously deploying a large number of apps, services, and products. When you're running a huge number of services in production, at web scale, you need to keep track of them, monitor them, and make sure they're all running properly. So the problem I want to deal with today: one important part of a monitoring system is having some sort of unique identifier for each service, so you can tell two services apart, or know which one went wrong. Basically, what we want to do is implement what we'll call a service ID generator. When a service starts up, it needs to get a unique ID that no other currently running service is using, so we can uniquely identify this instance of this service, you know, this app running in this data center doing this thing. That's the main part of the problem. The other aspect is that the uniqueness requirement only applies to currently running services: no two currently running services should have the same ID.
But you could have one service get an ID and stop, and then later another service gets that same ID. So we'll also have a release functionality: when a service is done, it tells us that its ID has stopped running. Basically, this lets us know what's currently running and also uniquely identify everything at any point in time. So that's the problem. Any questions, or anything more to talk about?
Ghost Koala: Let's see. So is there any indication of how many processes could be running? Or, how many unique IDs might I need to issue at the same time? I'm thinking about whether I need an int or a 64-bit integer, so, do I have to worry about the size of the unique ID?
Winged Avenger: Good question. Let's assume that a 32-bit integer is sufficient to hold all possible things that can be running at any one time.
Ghost Koala: Okay, cool. So a 32-bit integer is good. Let's see, what else? What about concurrency? Do I have to worry about multiple processes coming in at the same time, either requesting an ID or looking to release an existing ID?
Winged Avenger: Yeah, so for now, let's assume there's no concurrency, or that you can handle any concurrent case by just putting a global lock around everything, and then we can figure out a better way to do that later.
Ghost Koala: Okay. And then maybe the last question, for now at least: is the functionality I'm working on here strictly restricted to generating and managing the IDs themselves? Or do I have to hold on to any sort of information about the processes that own those IDs?
Winged Avenger: Let's say for now it's strictly just generating and managing IDs.
Ghost Koala: Just generating, okay. Alright, so some initial thoughts. This feels roughly like a memory allocation problem, where you have a free memory stack somewhere that grows as memory is released; it seems a little bit similar to that. But in line with what I mentioned earlier, I'd like to outline what a brute force approach would be first, so let me sketch it out here. We'll call this the brute force: I'll manage a vector of integers, and these will be the issued unique IDs. Any time someone asks for a new ID, iterate the whole list to find the first unused integer; and the same thing any time someone releases an ID, iterate the whole list to find it. I think something like that should work. The only gotcha might be that the vector won't store any sort of order, so I might need to get a little creative with how I add and remove elements to make sure the first ID I find is the lowest ID. Or does that even matter? Maybe that's another question I should ask: when someone requests a new ID, is there a preference to reuse a previously released ID versus generating a new one? Or am I overthinking this, and should I just generate a new ID each time?
Winged Avenger: I think that's a good question. There is no preference on the order of IDs; the only requirement, like I said, is that the ID can't be one that's already in use by another service at that time.
Ghost Koala: Okay, so then really the easiest approach is to simply keep track of an integer, increment it by one each time someone requests an ID, and just return it, because that ensures that any time someone requests an ID they get a unique one. In that case, releasing an ID would be a no-op, because I don't need that information; I can still hand out unique IDs without it. The main drawback of that approach is what I'll call a memory leak, though it's really more of a memory creep: even if you only ever have one process at a time using a unique ID, the ID space this generator consumes just grows forever. So the cost is O(n), where n is the total number of processes that have ever needed a unique ID, and we can do a lot better. The approach I'm listing out here is actually a bit better than that: now our space complexity is O(m), where m is the number of currently active processes that need unique IDs, versus the total number of processes that ever needed one. But I still think we can do better, mainly because iterating through the whole list to find the first unused integer is going to get pretty expensive as that list grows. So I'm wondering if there's a way to keep track of, okay, maybe this plays more into the memory allocation parallel I was talking about: rather than iterating through my list of currently used IDs, I'll maintain a second list of IDs that have been freed. That could be a stack or a queue; we don't really care about the order. And I'll just check whether it's empty or not.
And if it is not empty when a new ID is requested, I'll just pop one off of there and give it to whoever is asking. If it is empty, I know that every integer from zero up to my last generated ID is currently in use, so I'll just increment that counter by one and return the next integer. The reason that helps is that rather than the O(n) search here, I have a constant-time pop from the stack or queue. So that's the general approach I'm thinking of. Does that make sense?
Winged Avenger: Yeah, I think that makes sense. The way you approached this all makes sense. So yeah.
Ghost Koala: All right, cool. So unless you have any questions, I'll go ahead and start coding that up. There are a few ways to do this: I could do an object-oriented approach where everything is contained within an object, or I could do it in a more functional style, but then we'd have to pass a bunch of stuff around. So I'm going to do this as a class, and we'll call it IDGenerator. I always forget the semicolon at the end, so we'll put that there. The public interface is going to be pretty straightforward: we'll have an int getID(), that's not const, and we'll have a releaseID(), and for that you're going to have to give me the ID to release. That's the only information it needs, and I'm not going to tell you anything back, actually. Good. Then let's talk about the data I'm going to store. One piece is the lastID that was issued; we'll initialize that at zero. Next, and I'm wondering if I need to keep track of the IDs that are out? I don't think I do, actually, because I'm only interested in whether to give you the next ID that I haven't issued yet, or one of the ones that has been released back to me. So I don't need to store the issued IDs. Instead, I'll store a stack of integers, freeIDs, initialized empty. Okay, so that's the basic structure of our class. I'm just going to put the definitions right inline here, if that's fine with you.
Winged Avenger: Yeah, that's totally fine. Just like in a .cpp file, yeah, exactly.
Ghost Koala: All right, let's see. For getID, like we said earlier: if freeIDs.size() is greater than zero, then I'm going to return the popped value from freeIDs. Otherwise, I'm going to return lastID, and I'll do the post-increment here, because that returns the value first and then increments it afterwards. Okay, that looks right. And then for releaseID, let's come over here. All right, so how do I do releaseID? Okay, this should be, well: do I have to worry about misbehaving clients? In other words, say I issued the ID 1 and it's been released back to me; do I have to worry about someone trying to release that ID again?
Winged Avenger: Really good question. For now, let's assume the clients are well behaved, but that's definitely a question we can talk about later: how you'd handle that.
Ghost Koala: Okay, so this one should be pretty easy: it's just freeIDs.push. And, man, at least at first glance, that's the totality of the approach I was thinking of. So maybe let's just walk through this once with a use case, just to make sure it works as expected. Say someone calls getID: this returns 0, and now lastID is equal to 1. Call getID again: it returns 1, and lastID is equal to 2. I'll get one more ID: it returns 2, and lastID is equal to 3. Now I'm going to release ID 1. So lastID is still equal to 3, and freeIDs holds the 1 I pushed onto the stack with the push here. Good. And now when I call getID, the size here will be greater than zero, so I'll pop that value off and get the 1 back. lastID is still equal to 3, and freeIDs is now empty. Okay, this looks like it does what we want it to do.
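[Editor's note: the CoderPad contents aren't captured in the transcript. A hypothetical reconstruction of the class being described, in C++, might look like the following. Note that `std::stack::pop()` returns `void`, so the "return the pop" step needs `top()` first:]

```cpp
#include <stack>

class IDGenerator {
    int lastID = 0;           // next never-before-issued ID
    std::stack<int> freeIDs;  // IDs released back to us, available for reuse
public:
    int getID() {
        if (freeIDs.size() > 0) {
            int id = freeIDs.top();  // std::stack::pop() returns void,
            freeIDs.pop();           // so read top() before popping
            return id;
        }
        return lastID++;  // post-increment: return the value, then bump it
    }
    void releaseID(int id) {
        freeIDs.push(id);  // assumes well-behaved clients (no double release)
    }
};
```

[This reproduces the walkthrough above: three getID calls return 0, 1, 2; after releaseID(1), the next getID returns 1, and the one after that returns 3.]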
Winged Avenger: Yeah, this looks good. Cool. Okay, I think this is a good start. So let's first talk about how this approach would run if we actually had it in a production system. First, let's formalize the runtime and space complexity of this solution.
Ghost Koala: Sure. So the runtime complexity of getID is constant: since we're using a stack here, the pop is a constant-time operation, and incrementing an integer is constant time. And the same thing with releasing an ID: pushing onto the stack is also constant time. So the time complexity is constant for either operation. The space complexity, let me write this down: time complexity is O(1) for both getID and releaseID. Space complexity, this is going to be O of, we'll call it active IDs... right? So, however many services are currently using one of my IDs. No, that's not right; it's just released IDs, so it doesn't matter how many services are currently active, only how many have released. Actually, not even that; this is a little more challenging, because it's more like released IDs minus reissued IDs. Is that even right? Okay, this is more challenging than I expected, so let me walk through it. Before any IDs have been released, my space complexity is just O(1), because I'm not storing anything. Then as IDs get released, say all anyone is doing is releasing, the space grows to O(released IDs). But as more are requested, that gets reduced back toward released IDs minus reissued IDs. So I guess we can maybe talk about an average space complexity here, and really it would depend on what's going on on the client side. Is it mostly balanced between requests for IDs and releases? Is there going to be a time when I get a huge influx of released IDs and then a lull, in which case the released-IDs term could get really, really large?
Or is it mostly just going to be new IDs getting requested, in which case this is going to be more or less zero, with only the handful that get released; otherwise the lastID integer here just increases. So I don't have one answer for you on that one; it's going to depend on how it's being used.
Winged Avenger: Yeah, that makes sense. So it's dependent on the workload, essentially. Right?
Ghost Koala: Exactly, exactly.
Winged Avenger: So like you said, we could talk about the worst case, and we can talk about the expected case as well.
Ghost Koala: Yeah, that's a good idea, so let's do that. This one here would be the worst case. And for the expected case, honestly, I would expect usage to be mostly balanced between requested and released, so I'd actually expect the small-stack scenario to be more along the lines of the expected case.
Winged Avenger: Gotcha. Yes, that makes sense. Okay, awesome. So for the expected case, like we talked about, I think this is a pretty good solution. You can't do much better than O(1) for that case, right?
Ghost Koala: Right.
Winged Avenger: So obviously that's not the interesting part. What I want to talk about now is the worst case. Specifically, as you described, the worst case for this solution is when a huge number of IDs have been released, and they're just taking up a huge amount of memory in your stack. So what I want to do is try to quantify not just the space complexity, but how much memory we'd actually use. Because especially on the backend, when you're deploying services to production, you need to be able to reason about and allocate memory for your services. So let's think about the actual memory used, not just the big-O, for a given n.
Ghost Koala: Gotcha. Okay, so what we're storing here is 32-bit integers, which are four bytes each. So memory use = 4 × n bytes is the total amount of memory we're using in the worst case here.
Winged Avenger: Yeah. And there's actually a bit of extra overhead in how stacks are implemented in C++.
Ghost Koala: I thought they were implemented using a heap?
Winged Avenger: I think, my understanding is, the standard library stack is implemented using a deque by default. Yeah.
Ghost Koala: But I can tell it to use a vector instead. Yeah, you're right, though; I don't know what that overhead would be for the deque, actually.
Winged Avenger: No worries. And actually, this is part of why I wanted to talk through this. Basically, most array-based data structures use what's called a doubling approach. Say I have an array with four slots: 1, 2, 3, 4. Now I push a fifth element onto the stack. If I made a new array of five slots, copied everything over, and then put the fifth element there, that would be O(n) on every single push, which is not ideal. So what most implementations do is, every time they need to reallocate, they double the capacity. This makes pushes amortized constant time: when I double from four to eight, my next four pushes are all constant time, so even though I'm doing four elements of extra copy work now, it averages out to constant per push. But this also means that if I have n things in my stack, the worst-case memory is actually up to two times that.
Ghost Koala: Right, right. So we'll call the actual use 2 × 4n bytes, plus whatever the overhead of the deque is. Some bookkeeping fields, probably, tracking the size; even a vector keeps track of the last index and so on. So yeah, this will be roughly 2 × 4 = 8 bytes times the number of IDs.
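[Editor's note: the doubling behavior discussed here can be observed directly by watching a `std::vector`'s capacity as elements are pushed. A small sketch; the exact growth factor is implementation-defined (libstdc++ doubles, some implementations grow by 1.5×), but growth is geometric either way, so the number of reallocations stays logarithmic in the element count:]

```cpp
#include <cstddef>
#include <vector>

// Count how many times the vector reallocates while pushing n elements.
// Geometric (doubling-style) growth keeps this O(log n), which is what
// makes push_back amortized constant time.
int countReallocations(int n) {
    std::vector<int> v;
    std::size_t lastCap = v.capacity();
    int reallocations = 0;
    for (int i = 0; i < n; ++i) {
        v.push_back(i);
        if (v.capacity() != lastCap) {  // capacity jumped: a reallocation happened
            ++reallocations;
            lastCap = v.capacity();
        }
    }
    return reallocations;
}
```

[For 1000 pushes this reports on the order of 10-20 reallocations rather than 1000, and right after a growth step the capacity can be up to twice the size, matching the "2×" worst-case memory figure above.]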
Winged Avenger: Yeah, exactly. So, as we talked about initially, let's assume for now that n is 2 billion. How much memory would we use? Just to do some actual math.
Ghost Koala: Yeah. So for an n of 2 billion, this would be 16 billion bytes. That's a lot.
Winged Avenger: Yeah, that's 16 gigabytes, right? Which, you know, modern computers have that much, but it's still non-trivial for what we're trying to do. Cool. So now what I want to talk about is: let's assume constant time is super fast, and we're actually bottlenecked not on time but on space. What can we do to reduce the space used, even if it means making our time complexity worse?
Ghost Koala: Okay. Because the first thing that pops to mind when we're talking about limitations on space is just setting a cap, right? That would be pretty easy to do: maybe in the constructor of the ID generator we set a maximum size for our stack, and any time it reaches that amount, whenever something is released, instead of putting it back into freeIDs we just do nothing with it. So that's one approach. But that wasn't quite your question. Your question was whether there's a way to trade off time complexity for space complexity in the worst case we're talking about here.
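[Editor's note: the cap idea just mentioned could be sketched like this. The names `CappedIDGenerator` and `maxFree` are hypothetical, and, as the interviewer points out next, the variant sacrifices the guarantee that IDs are never lost:]

```cpp
#include <cstddef>
#include <stack>

class CappedIDGenerator {
    int lastID = 0;
    std::size_t maxFree;      // cap on the free-list size, set in the constructor
    std::stack<int> freeIDs;
public:
    explicit CappedIDGenerator(std::size_t cap) : maxFree(cap) {}
    int getID() {
        if (!freeIDs.empty()) {
            int id = freeIDs.top();
            freeIDs.pop();
            return id;
        }
        return lastID++;
    }
    void releaseID(int id) {
        if (freeIDs.size() < maxFree)
            freeIDs.push(id);
        // else: silently drop the ID; it can never be reissued,
        // which is exactly the trade-off the interviewer objects to
    }
};
```

[With a cap of 1, releasing IDs 0 and 1 keeps only 0; the dropped 1 is gone for good, and the counter keeps growing.]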
Winged Avenger: Yeah, and I do think we can explore other approaches for reducing space. The one you mentioned is an interesting one, but I think it basically has the same worst case as the initial solution we talked about, where you just increment an ID and never reuse it. You can easily imagine the stack is at its cap, and I have a working set that keeps churning, getting one, releasing two, getting one, releasing two; the extra released IDs are lost forever, so the counter still grows without bound.
Ghost Koala: Yeah, okay. Got it.
Winged Avenger: It's an interesting discussion, but I think for the purposes of this problem we do want to keep the same guarantee the current solution has, where you never lose IDs permanently.
Ghost Koala: Right, exactly, exactly. Yeah, okay. So let's see, what can we do? So really the question is: when an ID is released, how can we trade off some time complexity to save on the storage we're doing here? I don't really know yet. But if I think about it: right now I have a kind of dumb counter down here, lastID, which is just the maximum ID, the last ID I issued, and I'm just incrementing it by one. I'm wondering if there's some way to add more information to that, to help me reduce the amount of space I'm using up here in the freeIDs stack. So I'm just going to think through that for a minute or so and see if I can come up with something.
Winged Avenger: Yeah, totally. Sounds good. No worries. We still have like 20 minutes left.
Ghost Koala: Okay, cool. Not under any time pressure.
Winged Avenger: Right. As I said, with this problem I'm more interested in, as we change things up and peel back layers, seeing how you think about it. It's less about coding 50 lines for the problem.
Ghost Koala: Gotcha, gotcha, okay. It's tough, because I have no guarantees that IDs are going to be released in any kind of order. If they were released in the order they were first issued, that might give me a path; I could change up the approach. But since they could come in randomly, that's the part that's tripping me up here a little bit.
Winged Avenger: Yeah. For sure. It's tricky.
Ghost Koala: I'm wondering how the memory allocators do it now?
Winged Avenger: Yeah. So one way to think about this: if you want to reduce memory usage, like we talked about, maybe take a step back and think, ignoring data structures, ignoring code, about what we actually need to represent about the IDs. Right now we're representing the set of free IDs as a list, right? What else could we do? What, at minimum, do we need to know to solve this problem?
Ghost Koala: Now, that's a solid hint, actually. Because if you think about it, really what I want to know is a set of intervals representing the released IDs. Then as an ID is released, I can either extend an appropriate existing interval, if it's adjacent to one, or add a new interval. So by storing intervals rather than individual released IDs, I should be able to reduce the space used here, again at the expense of now having to manage this list of intervals, which adds to my complexity.
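[Editor's note: the interval idea might be sketched as follows, a hypothetical illustration using an ordered `std::map` of free ranges; the transcript never shows code for it, and as the interviewer notes next, it doesn't improve the worst case:]

```cpp
#include <cstddef>
#include <map>

class IntervalIDGenerator {
    int lastID = 0;
    std::map<int, int> free_;  // start -> end (inclusive) of freed ID ranges
public:
    int getID() {
        if (!free_.empty()) {
            auto it = free_.begin();
            int id = it->first;
            int end = it->second;
            free_.erase(it);
            if (id < end) free_[id + 1] = end;  // shrink the interval we drew from
            return id;
        }
        return lastID++;
    }
    void releaseID(int id) {
        int start = id, end = id;
        auto next = free_.find(id + 1);            // interval starting right after id?
        if (next != free_.end()) { end = next->second; free_.erase(next); }
        auto it = free_.lower_bound(id);
        if (it != free_.begin()) {                 // interval ending right before id?
            --it;
            if (it->second == id - 1) { start = it->first; free_.erase(it); }
        }
        free_[start] = end;                        // store one merged interval
    }
    std::size_t intervalCount() const { return free_.size(); }
};
```

[Releasing 0, then 2, then 1 merges everything into the single interval [0, 2], so contiguous releases are cheap; but releasing every other ID, as in the interviewer's counterexample, still costs one map entry per ID.]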
Winged Avenger: Yeah, that's an interesting idea, and I like it. Unfortunately, I don't think it actually saves worst-case space. Say I had a workload where I get a ton of IDs and then release every other one: 0, 2, 4, 6, 8, 10, 12. Now my released set is a bunch of size-one intervals, one for every two IDs. So I like the idea, but I think this specifically still has the same worst case. I agree the expected case would probably be good: if you think about a service deployment, services that start at the same time are probably also likely to stop at the same time, so getting contiguous runs of IDs back could make sense. But yeah, the worst case is the same.
Ghost Koala: Okay, yeah, I think you're right, and I was actually just getting ready to say the same thing, honestly. Okay, so back to your point from before: take a step back and think about what information we need about the IDs. I guess we could invert it, although I don't know that this would necessarily be much better; it would be similar to what I started with at first. Rather than storing the free IDs, I store the issued IDs. Now whenever a new ID is requested, that won't be constant time anymore; it depends on how we store it, but it's probably an O(n) operation. But in the worst case we were talking about up here, where we have a ton of released IDs, I'd now have constant space. This is just exchanging one worst case for another, though, because if I have a ton of currently issued IDs, I'm back in the same situation. So I don't know that it's necessarily any better.
Winged Avenger: Yeah, that makes sense. It's just a trade-off, right? But that's why this is interesting: there's not necessarily an obvious, always-best solution.
Ghost Koala: Yeah, well, it's definitely super interesting. But I'll be honest: I'm a little stuck on how to move forward here.
Winged Avenger: Yeah, no worries. I think you're definitely on the right track; there's basically just one way of framing it that hasn't come to mind, which is not necessarily obvious. Let me put it another way. Instead of trying to store the set of IDs that are in one particular state, like all free IDs or all issued IDs, what if we just tracked, for each ID, whether it's in use or not? So framing it a different way: instead of a set of all things in one state, a flag per ID saying whether it's in use.
Ghost Koala: I see. So how would that help us? For each ID, whether it's used or not... what comes to mind when you say that is something like a map, where the key is the ID and the value is whether or not it's in use. So as IDs are requested, I add a key, something like this, and as someone comes in and releases one, I just come in here and set it to false. Then the next time one is requested, I have to search through and see if I can find one that's false, return that one instead, and set it back to true. The reason I'm hesitant here, though, is that this map is just going to grow with the total number of IDs that's ever been issued.
Winged Avenger: Yeah. So I think we can't avoid space proportional to the number of possible IDs in the worst case. So it's really about how much we can save on space, assuming we have to store something for every ID.
Ghost Koala: Right, yeah. Well, I still don't think this would be better than our O(released IDs) up here, because released IDs is always going to be less than or equal to the total number of IDs ever issued. So this map's worst case is going to be the same as the one we were already looking at.
Winged Avenger: Yes, so that's true. But I think we can do basically this same approach, since the map idea is simple, in a way that's much more memory efficient. Say I have 2 billion IDs. What if I just had an array of bools of size 2 billion?
Ghost Koala: Right. Okay, so that would be more space efficient. And then here, the index equals the ID, and the value would be issued or not issued. Okay.
Winged Avenger: Yes. So how much memory specifically would we use for this?
Ghost Koala: I see, because a boolean is one byte, not four bytes like an int. So this would be two gigabytes of RAM.
Winged Avenger: Which is actually a lot better, right?
Ghost Koala: It's a lot better, yeah. Okay. So see, I was getting all caught up in the fact that the element count would be the same. But this makes a whole lot of sense: I'm storing a smaller piece of data per element, so the overall amount of RAM used is lower.
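The memory arithmetic from this exchange, written out as a sketch (assuming the typical 1-byte `bool` and 4-byte int, which is implementation-defined but holds on mainstream compilers):

```cpp
#include <cstdint>

// Memory needed to track 2 billion IDs, one flag per ID.
constexpr std::int64_t kNumIds = 2'000'000'000;

// One bool per ID (bools are typically 1 byte): ~2 GB.
constexpr std::int64_t kBoolArrayBytes =
    kNumIds * static_cast<std::int64_t>(sizeof(bool));

// Compare with storing each ID as a 4-byte int in a set
// (payload alone, ignoring the set's own overhead): ~8 GB.
constexpr std::int64_t kIntSetBytes =
    kNumIds * static_cast<std::int64_t>(sizeof(std::int32_t));
```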
Winged Avenger: You got it, yeah. That's kind of the thing that's tricky to realize. In general, when you want to optimize memory, look at how much you store for each thing.
Ghost Koala: Right, yeah. And honestly, to what you were saying earlier, taking a step back: when you think about time or space complexity, you're thinking about how it grows with the size of the input. But once we're optimizing at the level you're talking about here, I'm no longer talking about how it grows with the size of the input; I'm talking about the discrete amount of memory being used, and what I can do to optimize that a little better.
Winged Avenger: Yeah. I apologize if I didn't make that clear.
Ghost Koala: No, you did, you did. I just haven't come across this type of question before, so it was unfamiliar to me. But now that I have that context, the hints you gave were really good.
Winged Avenger: Okay, awesome. I'm glad this makes sense. Cool. So with this bool-array approach, how would you implement get ID and release ID?
Ghost Koala: Sure, okay. So with the bool approach, for get ID... well, again, it depends on the data structure used, but if we're just using an array here, get ID would be O(n), because I have to iterate through the whole array to find the first false, and that will be the ID I return; in the worst case that would be at or near the very end. For release ID, would it be the same? Actually, no, that will be constant, because I have constant-time access: if you tell me to release index i, I can, in constant time, access that index and flip it to false. Those are the time complexities.
Winged Avenger: So then, back to the case you talked about earlier, with misbehaving clients: what happens if a client asks you to release an ID that's already been released?
Ghost Koala: Well, it won't blow up, because if that ID is already set to false, it'll just set it to false again; it'll be a no-op, and hopefully the compiler can optimize that away. But it's also not providing any feedback, right? So again, that's kind of a trade-off there: the client won't know that it tried to release an ID it didn't hold, one it maybe really meant to release to a different generator on another server, and I'm not giving it that feedback here. I guess I could actually give release ID a boolean return value, and if the ID is already false when you ask me to release it, I could return false instead of true.
Winged Avenger: Yeah, that makes sense. Cool. Okay, so let's do that.
Ghost Koala: Yeah, how much time do we have left? Alright, let's see. So, as we talked about, I'm going to use a vector here, which, like I said earlier, is an array underneath the blankets, but it'll dynamically grow for me. And this will be bools. Now, when an ID is requested, I know there are some algorithms in the standard library that can find the first false; I just don't remember them off the top of my head, so I'm going to iterate through my vector to find the first false. First, though, if the vector is empty, I'll push back a true and give you the zeroth ID. Otherwise, I iterate through, and if I find an index that's false, I set it to true and return that index. Otherwise, I push back one more true and return the size minus one, because the size is always one more than the last index we added.
Ghost Koala: Oops, I think this bracket belongs here. There we go. Now, release ID should be pretty easy. So again, thinking about misbehaving clients, they might give me an ID that's outside the range I've issued, in which case I just don't do anything; otherwise I take that index and set it to false. Yeah, I think that does it. Okay.
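The vector-of-bools implementation being described might look roughly like this (a sketch; names are illustrative, and the boolean return from release gives misbehaving clients the feedback discussed earlier):

```cpp
#include <cstdint>
#include <vector>

// Sketch of the bool-array approach.
// get_id() scans for the first released slot: O(n) time.
// release_id() indexes directly: O(1) time.
class IdGenerator {
public:
    std::int64_t get_id() {
        // Reuse the lowest released ID, if any.
        for (std::size_t i = 0; i < in_use_.size(); ++i) {
            if (!in_use_[i]) {
                in_use_[i] = true;
                return static_cast<std::int64_t>(i);
            }
        }
        // Nothing released: grow by one and hand out the new highest ID.
        in_use_.push_back(true);
        return static_cast<std::int64_t>(in_use_.size()) - 1;
    }

    // Returns false if the ID was out of range or already released,
    // so a misbehaving client gets some feedback.
    bool release_id(std::int64_t id) {
        if (id < 0 || id >= static_cast<std::int64_t>(in_use_.size()) ||
            !in_use_[id]) {
            return false;
        }
        in_use_[id] = false;
        return true;
    }

private:
    std::vector<bool> in_use_;
};
```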
Winged Avenger: That makes sense. And obviously, as we talked about, you're trading off space for time here, and if we were memory-bound, this would definitely be a good trade-off, right?
Ghost Koala: Absolutely, absolutely.
Winged Avenger: Okay, cool. So we have about five minutes left, and I just want to talk a bit more about some of the things we mentioned earlier. For example, we were talking about concurrency: what happens if we have multiple clients accessing this? How do we make it handle that?
Ghost Koala: So the first thing that comes to mind is adding another member variable to our class here: a mutex. Note that we can lock it. Then I would essentially do this here; I think it goes like this. So this would create a lock using that mutex, and if another thread came in and tried to use this while I still hold the lock, it's going to block until the lock is released. The way this works in C++ is that once it goes out of scope, it'll go ahead and release it. So this will create kind of a backlog if multiple threads try to access it together, and it'll just let one in at a time, each one locking it as it comes in.
Winged Avenger: Gotcha. Yeah, that makes sense. Okay, so do you need to unlock this, or how does that work?
Ghost Koala: Yeah. So again, I haven't done this a whole lot, but I think it actually works like this: this returns the lock, and when my lock goes out of scope, it releases the mutex. And the reason we need the mutex as a class member is that everyone uses the same mutex to try to get their own lock; if a lock has already been issued, they block until they can get their own. So I don't think I have to release it manually; because of how scoping works, it'll do it for me when the lock goes out of scope.
Winged Avenger: Yeah, that makes sense. Okay, cool. So do you also have to put that lock anywhere besides get ID?
Ghost Koala: Yes, absolutely. Do I... let's see. Yes, I do. And in fact, if I wanted to get really fancy, I'd put it right before the access here.
Winged Avenger: Yeah. Okay, cool. So we have about two minutes left, and I want to start on my feedback. This seems like a reasonable stopping point, so let's talk through it. I'll give you some feedback, and I'd also appreciate any feedback you have for me on how I ran the question. So overall, I thought you did really well on this question. I liked how you broke down the problem space. I definitely don't expect people to always land on that reframing of the problem on their own, tracking, for each ID, whether it's used or not; but once I gave you the hint, you were able to run with it and figure it out. And in general, since you said you wanted to focus on problem solving, this is a very problem-solving-focused question, which is why I wanted to use it, especially since you mentioned you're backend focused; I got to see you showcase those skills as you walked through a lot of different potential solutions and talked about their trade-offs. You also clarified the problem well at the start. Something I typically see people trip up on is not understanding the requirements: the size of the ID space, the concurrency, the uniqueness requirement. Those are important to get right for this problem. And you talked about clients that might misbehave and identified some of those corner cases.
The one thing I would recommend: I know at the start you talked about walking through a brute-force solution before a more optimal one. For this problem, as we ended up seeing, there isn't necessarily a single most optimal solution; it's all trade-offs. For a lot of problems there is a straight-up worse and better solution, but I'd be careful about framing it as "I need to find a bad solution and then a good one." Frame it more as thinking about what the potential solutions are and what their trade-offs are, and whether one of them is objectively better. In my opinion, framing it that way also helps you articulate why one is better than another: "I could solve the problem this way or this way; here's the brute force, here's the alternative; I'm going to go with this one because of these characteristics," rather than making it sound like there's just a bad solution and a good solution. Does that make sense?
Ghost Koala: Yeah, absolutely. That's great feedback. I appreciate that.
Winged Avenger: Yeah, that was the main thing that stuck out to me, because overall it was good. Besides that, like I said, I liked your clarifying questions, the way you broke down the problem space, and the way you took feedback: when we changed the requirements or reframed the problem, you took it in stride and worked through it. So overall, I think you did well, and I'm optimistic for tomorrow. But, you know, every interview is different, right?
Ghost Koala: Yeah, we'll see how it goes.
Winged Avenger: Yeah, for sure. Any feedback for me, or anything else?
Ghost Koala: I think you did a great job framing the question initially and answering my questions. You were really good at identifying when to step back and let me flounder a little bit, but also when to drop in some hints as needed and guide me along. So I can only hope I get an interviewer like you tomorrow, because this was very collaborative, very back-and-forth, which really takes some of the stress off of me as a candidate. So yeah, I think you did a great job, and I don't really have any negative feedback.
Winged Avenger: I'm glad to hear that. Honestly, I prefer these types of interviews in general, because I feel like they match what we actually do in our software engineering jobs. It's not "here's a contrived problem, go code it"; it's collaborating, discussing, identifying solutions to complex problems, and breaking them down into simpler parts. In my opinion, the problem solving and the collaboration are the core of it; the coding part is basically just to make sure you know the basics of how to code, and the rest is working well with other people and reasoning about problems that are unknown. That's the whole job, right? Cool. So, as far as feedback on this problem specifically, since I liked your problem-solving approach, I think you might find it interesting to see some of the ways this problem typically evolves as we dig into it. One thing in particular: we talked about using this array of bools as the data structure, which, as you mentioned, uses one byte per ID. But technically, the minimum information you need per ID, in use or not, is one bit. And there's actually a data structure, and I wouldn't expect you to be familiar with it, called a bitset. Logically it's an array of bools, but the way it's actually implemented is as an array of bits. It has basically the same API, so it's the approach you would actually implement here. Just thought you might find it interesting.
Ghost Koala: Yeah, that's cool, I'll look at that. Even if I didn't know about the bitset, though, if we had time, we probably could have gotten there just by doing bit masks on the bools I'm already storing; I'd get eight spots per byte in my array.
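The bit-packing idea they're describing could be sketched with `std::bitset` like this. Note that `std::bitset` takes a compile-time size, so this uses a toy capacity purely for illustration; a real generator over billions of IDs would use a dynamically sized bit array instead:

```cpp
#include <bitset>
#include <cstdint>

// One bit per ID instead of one byte: 8x less memory than the bool array.
// Toy capacity; 2 billion IDs at one bit each would be ~250 MB.
constexpr std::size_t kCapacity = 1024;

class BitsetIdGenerator {
public:
    // Scan for the first clear bit; O(n), like the bool-array version.
    std::int64_t get_id() {
        for (std::size_t i = 0; i < kCapacity; ++i) {
            if (!in_use_.test(i)) {
                in_use_.set(i);
                return static_cast<std::int64_t>(i);
            }
        }
        return -1;  // exhausted (a real version would need to handle this)
    }

    void release_id(std::int64_t id) {
        if (id >= 0 && id < static_cast<std::int64_t>(kCapacity)) {
            in_use_.reset(static_cast<std::size_t>(id));
        }
    }

private:
    std::bitset<kCapacity> in_use_;  // packed: one bit per ID
};
```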
Winged Avenger: Yeah, exactly. So that's my one thing. And then there's one more place you can take this, which we obviously didn't have time for, but if you want to dive even further into it: the linear get time is still less than optimal, and you can actually optimize the time without using much more space. The problem, as you said, is that you have to scan the whole array to find an unset spot, right? So what you can do is build, call it almost like an index, on top of this. Say you have a bitset like this, split into sections. Like you said, to find a false you might have to scan through a bunch of stuff. But if I keep one bit per section saying whether there's anything open in that section, then I can skip entire sections. So that summary bit is a one if everything in the section is in use, and a zero if not. To find a free ID, I just find a zero here, go into that section, and I know I'll find something.
Ghost Koala: Gotcha, gotcha. And then the trade-off there will be that you have to keep that index layer on line 50 updated, any time you issue a new ID or release one.
Ghost Koala: Okay, cool. So updates are more expensive, but still not that expensive, and searches are a lot faster.
Winged Avenger: Exactly, exactly. You can even imagine building a tree of these, right? You take those summary bits and summarize them again at the next level. I think the data structure is actually called a bit tree. If you build this tree, you have log n levels, so an update is O(log n) and a search is O(log n). And because of the tree, you're using a little more memory than the flat bitset, but I think less than twice as much, because the level sizes decrease exponentially.
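The section-index idea could be sketched as a two-level structure: one bit per ID in 64-ID sections, plus a summary flag per section marking it full, so the search can skip full sections. This is a rough sketch with illustrative names; extending the summary recursively gives the log-time bit tree mentioned above:

```cpp
#include <cstdint>
#include <vector>

// Two-level sketch of the "index on top" idea: section_full_ marks which
// 64-ID sections have no free ID, so get_id() skips them wholesale.
class TwoLevelIdGenerator {
public:
    explicit TwoLevelIdGenerator(std::size_t num_sections)
        : sections_(num_sections, 0), section_full_(num_sections, false) {}

    std::int64_t get_id() {
        for (std::size_t s = 0; s < sections_.size(); ++s) {
            if (section_full_[s]) continue;  // skip: every ID here is in use
            for (int b = 0; b < 64; ++b) {
                std::uint64_t mask = std::uint64_t{1} << b;
                if (!(sections_[s] & mask)) {
                    sections_[s] |= mask;
                    if (sections_[s] == ~std::uint64_t{0}) {
                        section_full_[s] = true;  // keep the summary consistent
                    }
                    return static_cast<std::int64_t>(s) * 64 + b;
                }
            }
        }
        return -1;  // all IDs in use
    }

    void release_id(std::int64_t id) {
        if (id < 0 || id >= static_cast<std::int64_t>(sections_.size()) * 64) return;
        std::size_t s = static_cast<std::size_t>(id) / 64;
        sections_[s] &= ~(std::uint64_t{1} << (id % 64));
        section_full_[s] = false;  // the extra update cost discussed above
    }

private:
    std::vector<std::uint64_t> sections_;  // one bit per ID, 64 IDs per section
    std::vector<bool> section_full_;       // summary: true = section has no free ID
};
```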
Ghost Koala: Got it. Yeah.
Winged Avenger: I don't expect everyone to get here; it's just where you can take this. I thought you might find it interesting.
Ghost Koala: Yeah, very much.
Winged Avenger: Cool. Do you have any other questions, or anything else you wanted to go over?
Ghost Koala: I don't think so; this was very good. I really appreciate you and your time.
Winged Avenger: Cool, awesome. Yeah, thanks so much for talking with me, and best of luck tomorrow. Hope it goes well.
Ghost Koala: Thanks. Appreciate it.