Invincible Cloud: Hey. Hello, how are you?
Golden Possum: Hi. Kind of good. I woke up at 5am to do this interview. So, I just woke up and ran on the computer. So let's do this.
Invincible Cloud: Sure, no problem. So, simply introduce yourself, you don't need to reveal your name. But I just want to know how many years of experience and what's your primary language?
Invincible Cloud: So how is your coding skills? Are you doing the recently asked Amazon questions?
Golden Possum: Yeah, so my coding is probably my best part. I do about four interview.io questions a week. Well I interview other people. Yeah, my coding skills are probably the, my greatest asset, I guess, because I do a lot of these. I don't know. I think I'm good at it.
Invincible Cloud: No, good. That's good, though. Sure. So let's jump to the system design problem. So I want you to design a system which can give me a unique ID. Just the unique ID. And we can extend the requirements as we go through each step. Create a system, which can generate a unique ID. And it has to be not just a single caller, it will be like distributed system.
Golden Possum: Okay. So let's create a system that will give me unique IDs. So distributed system that gives a unique ID. So the first thing... distributed, so geographically distributed? Well, I mean, it doesn't have to be geographically distributed.
Invincible Cloud: But let's start with generating a unique ID, and then we can extend the requirements towards that.
Golden Possum: So yeah, unique ID. No one's allowed to have the same one, my first. I mean, the simplest thing to do if it wasn't distributed is have a single counter. And whoever asked for a counter, and the counter is an atomic value, and whoever asked for the counter, just locks it, you increment it and go home, you increment, take and go home. So the easiest way to generate a unique ID. So I should ask, like, this unique ID is it being called by internal API? So is it unique... I'm going to call it unique ID service... being called by clients... is being called by sort of internal services. I think this affects my answer.
Invincible Cloud: So, okay. So it will be called by the clients. So clients will be calling also to have a unique ID.
Golden Possum: Okay. So, okay, so first thing is, I guess we'll do... So how to get... there's an instruction, which increments and also returns. So you're returning the incremented value. And I don't remember what it is. I remember writing going a while ago, and this instruction, so value, increments. And so it's an atomic, in this scenario, you don't have to lock it.
Invincible Cloud: But what do you mean, here? What do you mean by atomic value.
Golden Possum: Yeah, I'm just gonna google it because we got to this part of the giant system design. So let's see, atomic counters... Okay, so that's it.
Invincible Cloud: Or do you think you have a table in the background and this will increment the counter, the primary key, so you have a unique ID, right? So you have the last one. Okay, right? How about that?
Golden Possum: Okay. Yeah, I'll finish my sentence about atomic. So it's a value that... okay, it's a terrible idea for distributed systems. An atomic value is one where, if there are multiple threads running on the same computer and they both access it, they will not collide. So an update instruction will not collide with a read instruction. But I've been given a super big hint: go with a table. There are advantages with this one.
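A minimal sketch of the single-counter idea described above, assuming everything runs in one process. Python does not expose a hardware fetch-and-add instruction, so a lock stands in for the atomic increment here; the class name `AtomicCounter` and the method `increment_and_get` are illustrative, not taken from the interview.

```python
import threading


class AtomicCounter:
    """In-process counter: safe across threads on ONE machine only."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def increment_and_get(self):
        # The lock makes "increment, then read" a single indivisible step.
        with self._lock:
            self._value += 1
            return self._value


def hammer(counter, n, out):
    # Each worker thread asks for n IDs and records them.
    for _ in range(n):
        out.append(counter.increment_and_get())


counter = AtomicCounter()
results = []
threads = [
    threading.Thread(target=hammer, args=(counter, 1000, results))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No two threads ever received the same value.
print(len(results), len(set(results)))
```

This works only because all callers share one memory space, which is why it does not carry over to a distributed deployment.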
Invincible Cloud: I'm sorry. I'm actually trying to ask you about the disadvantages of this approach and also yours.
Golden Possum: So yeah, cool. I can totally see it because I just found out. So disadvantages. Atomic value. Yeah. I guess that's what that word means. This is a pretty solution. It's a... I guess I'll put this in advantages. I don't do anything? I mean, one thing that I noticed and this one, I can insert metadata into, for more detailed history of that. The one thing that I noticed in the solution is like logging is kind of built in. Because I'm inserted in the record anyway. So I guess I can put more stuff in that. IDs. Okay. Disadvantages? Let's see. So, my relational database fundamentals are pretty fundamental. I don't know how something like Aurora or distributed like MySQL services handle this.
Invincible Cloud: But it is just a simple table, right? A simple table. Yeah. You can have an auto-increment primary key, it will keep it all and it will tell you how many... say a trillion calls, right? Yes. So can it handle the load? That's the first thing, right? Okay, so the auto-increment may end at some point; it cannot give you unlimited IDs. Right? So that's the disadvantage.
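The table approach can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions: SQLite stands in for whatever relational database would actually be used, and the table name `ids` and helper `next_id` are made up for the example.

```python
import sqlite3

# In-memory database, used here only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (id INTEGER PRIMARY KEY AUTOINCREMENT)")


def next_id(connection):
    """Insert an empty row and return the key the database assigned."""
    cur = connection.execute("INSERT INTO ids DEFAULT VALUES")
    connection.commit()
    return cur.lastrowid


first = next_id(conn)
second = next_id(conn)
third = next_id(conn)
print(first, second, third)
```

Every ID costs one stored row and one write to a single table, which is the storage and latency overhead the conversation goes on to discuss.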
Golden Possum: Okay, so that's a question I haven't asked yet. What is the scale? Like, I mean, you said a trillion request, right? I think it's load or requests per second?
Invincible Cloud: Okay, so that's where I recommend you follow some template. So why don't you create the functional requirements and the non-functional requirements?
Golden Possum: Yeah, right.
Invincible Cloud: A non functional table wants me to, you know how is my system available 24/7? Durability? And how many calls per second, right? And how about another template. So I think I mean, I'm going to put some annotations for you. Okay. So but yeah, please go ahead. That's fine.
Golden Possum: All right. So functional requirements and non functional requirements. Cool. So, yeah, so non functional requirements is like, reliability. So those are, what kind of thing? I'm gonna put the words how often does this service...? Or no, I'm trying to formulate, so I can say, what kind of bad things have... So how often? How much? What are the uptime requirements of this service? So 24/7, 99% kind of deal? So you know, this theory of five 9's four 9's? Okay, good.
Invincible Cloud: Okay, yeah. I mean, I know the theory, but I'm thinking, how does it apply to which relational database I use? Yeah, because, for example, if I had that many calls, and I had a sequence, and it hit the end, what would my relational database do? Would it just crash? Maybe I could roll over the IDs. That may be a manual thing on my end, so I'd have a service that would just roll the IDs forward. Let's see. How long does it take to create a table in, like, Postgres... not very long, actually. But at that point, my ID is no longer unique. So that's a disadvantage. And I know in Aurora, like, MySQL's distributed platform, you commit to an append-only log and it's distributed, and eventually committed. So how long does the write-ahead log compaction take? And does it affect the starting time? I have no idea.
Golden Possum: So that's what I'm thinking we ignore that for a while. But let's think about this right? Requests per second non functional requirements. Okay. So maybe, you know, 1000 requests per second, right? Or let's pick something simple, just hundred.
Invincible Cloud: So that's enough for my non-functional requirements; I'll go back to the disadvantages. Cool. Okay, so I'm going to move down to line 35. So a disadvantage I can think of, with 100 requests per second... So I will just say MySQL's bigint is eight bytes, which is a really big number. I'm assuming the corresponding types in other databases can store eight bytes. So eight bytes, which is eight times eight bits. Right? Yes. Yeah. So what this means is this is a huge number... with 100 requests per second. I'm assuming it's a huge number. So this is 57 zeros.
Golden Possum: That's fine. You don't need to go into the detail, but just tell me what was the disadvantage of having a table?
Invincible Cloud: Yeah. The disadvantage is, I think, a little bit of storage overhead. I don't actually need that record; assuming I can ignore the logs and stuff, when I want super performance, I don't need to save the record. And the sequence may run out. That's two points, actually. So: don't need the table record.
Golden Possum: So what do you mean, you don't need the?
Invincible Cloud: Like, if I guess if I'm inserting records into a table, now I have a table filled with stuff I don't want. Or I don't need any more. Yeah, it's like, maybe one... was eight bytes. So it's an eight byte record.
Golden Possum: So you're saying that, let's say if you generate the number two, but you are using 8 bytes just for storage?
Invincible Cloud: Yeah. And eventually, if I have like, a million of those, I'll have eight million bytes, which is not that bad.
Golden Possum: So let's say you want to have this as distributed, right, which means that with only one table, multiple callers will be calling this one table. Right? That will be a lot of latency. So let's write that point in: distributed environments may have latency. That's another thing. But let's say you want to handle it by having multiple tables, right. Then you may end up with duplicates. Yes.
Invincible Cloud: Yeah, of course.
Golden Possum: Okay, cool. Okay. All right. So, yep. What's the next solution?
Invincible Cloud: Yeah. Okay, so I have another key idea. So I can keep the table concept, or I don't have to. So when you said, yeah, you may run it through multiple tables, I just realized my integer range for the big integer is like a byte by two. If I want, I can have multiple tables. So I guess we'll have the next solution. I think this merits another solution, maybe I should just put it next. Small, multiple tables, each with an integer range. So it's multiple relational databases. And we're just gonna say, multiple tables, each responsible for a range of integers. So in the same fashion, same REST endpoints. It's given by the auto-increment of multiple tables. So disadvantages. Yeah, let me do some math on that. 100 times 64. I'm gonna do the math on the requests per second, so I can estimate orders of magnitude.
Golden Possum: Okay. Cool.
Invincible Cloud: Okay, so now I have another question. How long is the service expected to be up? Like, forever? Is it like a year? Is it like maybe a month?
Golden Possum: Okay, so say 30 years.
Invincible Cloud: 30 years? Alright, so in math calculations, it's not a huge number. So this is the expected... However, it's not like so... Okay, so I have another question for what is my... I don't remember what the word is. But you say 100 requests per second, this seems like kind of reasonable, what is my like range for estimation? So like, do some days I get 200 per second? Or some days I only get 50? What is my, like, upper bound for requests per second, instead of? Or was that the upper bound? So was 100 per second the maximum requests I get, or will it fluctuate? And will I have like?
Golden Possum: Doesn't matter.
Invincible Cloud: Okay, my total expected, apparently comes to 94 billion. You know, we don't have to we can be...
Golden Possum: You said for 30 years, right?
Invincible Cloud: Yeah. So let's put it this way. Right. So yeah, you don't need the exact calculations, but just come up with a you know a variable or something. Right. So yes, 94 million, or, you know, 1 billion.
Golden Possum: No, it's 94 billion.
Invincible Cloud: That's 94 billion. Okay, so one trillion, right? So come up with that number. That's fine. But...
Golden Possum: So it's roughly 1 trillion. So I'll just...
Invincible Cloud: So do you think whatever the solution is that you covered it just now the table, atomic value, or anything that will work for this number?
Golden Possum: For this number? Yes. Because the bigint is well within range of that. The bigint is 57 zeros? I mean, maybe I... yeah, an unsigned bigint has a huge number range.
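The capacity estimate in this exchange can be checked with a few lines of arithmetic. This sketch assumes 365-day years and a constant request rate; the function name `total_ids` and the constant names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def total_ids(requests_per_second, years):
    """IDs consumed at a constant request rate over the service lifetime."""
    return requests_per_second * SECONDS_PER_YEAR * years


# Limits of an 8-byte (64-bit) integer column.
SIGNED_BIGINT_MAX = 2**63 - 1
UNSIGNED_BIGINT_MAX = 2**64 - 1

needed = total_ids(100, 30)
print(f"IDs needed over 30 years at 100/s: {needed:,}")
print(f"Signed 64-bit max:   {SIGNED_BIGINT_MAX:,}")
print(f"Unsigned 64-bit max: {UNSIGNED_BIGINT_MAX:,}")
print(f"Fits in signed bigint: {needed <= SIGNED_BIGINT_MAX}")
```

At 100 requests per second over 30 years this comes to about 94.6 billion IDs. An unsigned 64-bit integer tops out near 1.8 × 10^19 (a 20-digit number), so the range is not the limiting factor even at 10,000 requests per second.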
Invincible Cloud: Okay.
Golden Possum: So for example, if we're being very obsessive, we divide the integer range a lot. So yeah. Yeah, that's a huge number. All right. So let's say, for example, each table gets its own... So this is similar to the... something circular, or it's like consistent hashing maybe, so table one gets, like, zero to 1T and then table two gets up to 2T.
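One way to picture "each table is responsible for a range of integers" is the sketch below. It keeps the counters in memory rather than in real tables, and the names `RangeShard` and `make_shards` are assumptions made for illustration.

```python
class RangeShard:
    """Hands out IDs from the half-open range [start, end)."""

    def __init__(self, start, end):
        self._next = start
        self._end = end

    def next_id(self):
        if self._next >= self._end:
            # A real system would need a plan for this case.
            raise RuntimeError("range exhausted")
        value = self._next
        self._next += 1
        return value


def make_shards(count, size):
    """Split the ID space into `count` disjoint ranges of `size` IDs each."""
    return [RangeShard(i * size, (i + 1) * size) for i in range(count)]


ONE_TRILLION = 10**12
shards = make_shards(3, ONE_TRILLION)

# Shard 0 serves 0..1T-1, shard 1 serves 1T..2T-1, and so on.
sample = [shard.next_id() for shard in shards for _ in range(2)]
print(sample)
```

Because the ranges never overlap, the shards need no coordination with each other, at the cost of IDs that are no longer globally ordered by creation time.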
Invincible Cloud: Okay, so let's think. Okay, so let's think this. The solution is that what we have right now is a table but having a multiple table, right? The solutions that we have right now that may not work for our service. Let's see if I increase the 100 per second to 1,000 per second is not going to work, right? Or maybe you know, 10,000/second. So let's come up with something, you know, that's some other approach that scales better.
Golden Possum: Okay. So what other solutions do we have for? So should I keep going with I have a database with records? Or can I now design my own auto increment key?
Invincible Cloud: Yeah, I want you to go with the second solution.
Golden Possum: Okay, so the relational database thing, I had eight bytes per row, which meant that if I had, say, if we ended up scaling this to 1000 per second... But are the storage costs negligible? Alright, so 1 trillion bytes.
Invincible Cloud: So we're thinking about this... do you think you can return the epoch time?
Golden Possum: I don't think so. So seconds since 1970? That would mean that everyone who requests in the same second, which is roughly 100 callers in this scenario, gets the same value, so I can't return epoch seconds. Could I return a more precise timestamp? Like nanoseconds? Depending on my frequency. So, solution: return the time value... So we'll do advantages and disadvantages. Almost all programming languages have the precise time libraries needed to implement this. So I'm gonna do an order of magnitude. The disadvantage is that I need a really precise time function to scale, and even then there is a non-zero probability that multiple IDs get the same time.
Invincible Cloud: So think about it. Like how can you avoid that?
Golden Possum: Okay, so I need a really precise time function. Is there hardware to do this? Is that part of this question? Is there like, time hardware?
Invincible Cloud: Not a hardware.
Golden Possum: Yeah, there's computer functions.
Invincible Cloud: There's some small tweak that can actually handle this problem. Right. So it's a simple, you create a timestamp, but including timestamp, you will also do something else, right? You'll be calling it. Why don't you have you know, concatenate user or something?
Golden Possum: Yeah. So I could append user-specific information or a hash value, or a geographic key. So basically, each host should be given a key. So it's like: timestamp, node hash, user hash. So this means that in order to even have the probability of generating a duplicate key, the user has to make two requests at the exact same time to the same server. Okay, is that acceptable?
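A minimal sketch of the "timestamp, node hash, user hash" composite described above. The hashing scheme (a truncated SHA-256) and the names `short_hash` and `composite_id` are assumptions for illustration; truncated hashes can collide in principle, so this is not a uniqueness guarantee.

```python
import hashlib


def short_hash(text, length=8):
    """First `length` hex characters of SHA-256; truncation can collide."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def composite_id(timestamp_ns, node_name, user_name):
    """Join timestamp, node hash and user hash into one ID string."""
    return f"{timestamp_ns}-{short_hash(node_name)}-{short_hash(user_name)}"


same_tick = 1_700_000_000_000_000_000

# Two users hitting the same node in the same nanosecond still differ...
a = composite_id(same_tick, "node-eu-1", "alice")
b = composite_id(same_tick, "node-eu-1", "bob")

# ...but one user calling the same node twice in the same tick does not.
c = composite_id(same_tick, "node-eu-1", "alice")

print(a != b, a == c)
```

The last case is the remaining collision: same user, same node, same timestamp.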
Invincible Cloud: Oh, yeah, definitely. Yeah. So think about it. The same user can be can call your service multiple times at the same second. So how can you handle that?
Golden Possum: I mean, I can implement some sort of, like, priority queue, but that sounds awful. Yeah, I can basically say, here's a queue of this many times, and it dispenses IDs one at a time. That sounds awful. So how do I? I mean, I can have a vector clock in the back? It increments for each ID. If I have the vector clock, I don't need the rest. I'm assuming...
Invincible Cloud: What do you mean each ID?
Golden Possum: For each request.
Invincible Cloud: Request for a user right?
Golden Possum: No, just for a unique ID. So every time you want a unique ID, I increment the vector clock. And then append that number to the end of the timestamp. So like, I know. Yeah, just basically make the timestamp unique.
Invincible Cloud: Okay. Do you want to write like one example for me?
Golden Possum: So hope there's notation for... So V is one request. Like this timestamp.
Invincible Cloud: Okay, so yeah. So my other question is, so this, you were going to use the vector clock? If it is at the same nanosecond, right?
Golden Possum: So checking at the same nanosecond, I think...
Invincible Cloud: So thinking about this, right. So we have 123456.
Golden Possum: So why do we need a timestamp when we have a vector clock?
Invincible Cloud: No, I'm not asking that. So you see, at this, you know, at the same timestamp, you could get multiple calls. And you are going to use this vector clock.
Golden Possum: Yes, I was going to use it for all of them. But in retrospect, yes, it makes sense to use them for the same timestamps.
Invincible Cloud: There is no use for it with different timestamps. Right? Okay, that's good. Okay, that's what I want to know. So you came up with multiple ideas here. So first things first, I'm going to write them here. Okay. So the first thing you came up with is timestamp. The next thing is timestamp plus user ID. And the next one is timestamp plus user ID plus, you know, user IP address or something, right? Yeah. This is another one. And the last one is user ID, user IP address, and, you know, this one, the vector clock value, right, which is what you want to do. Yep. Okay, this is great. So this is what I'm looking for. So you will try to make a vector clock: if we see the same timestamp, then you're going to simply increase it. Okay, so I think we are almost at 35 minutes. So now I agree with this solution. So now I want you to design the system, no detailed design. I want you to handle, let's see, how you can scale it and everything. So let's move forward. Yes.
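The scheme agreed on here, a timestamp plus a counter that is only bumped when the timestamp repeats, can be sketched as follows. What the conversation calls a "vector clock" is modeled as a plain per-node sequence counter, which is a simplification; the class name, the ID string format, and the injectable `clock` parameter are assumptions for illustration.

```python
import threading
import time


class SequencedIdGenerator:
    """Timestamp + node ID + a counter that resets on each new timestamp."""

    def __init__(self, node_id, clock=time.time_ns):
        self._node_id = node_id
        self._clock = clock  # injectable so the behavior can be tested
        self._lock = threading.Lock()
        self._last_ts = -1
        self._seq = 0

    def next_id(self):
        with self._lock:
            ts = self._clock()
            if ts <= self._last_ts:
                # Same tick (or the clock stepped backward): reuse the
                # last timestamp and bump the counter instead.
                ts = self._last_ts
                self._seq += 1
            else:
                # New tick: the counter starts again from zero.
                self._last_ts = ts
                self._seq = 0
            return f"{ts}-{self._node_id}-{self._seq}"


# Frozen clock to show several calls landing on the same timestamp.
ticks = [123456]
gen = SequencedIdGenerator("n1", clock=lambda: ticks[0])
burst = [gen.next_id() for _ in range(3)]
ticks[0] = 123457
after_tick = gen.next_id()
print(burst, after_tick)
```

Uniqueness across nodes still depends on every node having a distinct `node_id`.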
Golden Possum: All right. So, okay, so, design this specific service. So alright. So I have... So is this part where use like a diagram sort of deal? Like?
Invincible Cloud: Yes.
Golden Possum: Okay. So we have a REST API. So I know there are faster ways to do this, but I'm not... Yeah, I should totally Google those. But currently, I'm a web developer. And all I know is my HTTP verbs. So we'll do get unique ID. So, here, we can have a load balancer. So we can go into a load balancer. And then to like one of many, one of many services, basically, this is kind of slow, because it's like, basically relative to the same thing? Maybe? I'm thinking we have one monolithic load balancer and all the requests go through that. But do we need a load balancer for application? Or is it like, we've designed something where it doesn't need one? I'm going to say that.
Invincible Cloud: What are you doing? You don't need the load balancer? I guess, because in my mind this is distributed. So because this is distributed... So it's all its geographical. Many users can call the same API at the same time, but is this correct? And without issue. Or do I need it for, for example, other reasons? Like, I don't want one of my data centers being the only hot one. Okay, so I will use that balancer. Alright. So we have my load balancer. No, we have three. I'm trying to think of, so I have a scale. So at this point, I want to say per continent?
Golden Possum: So what type of load balancer are you talking about? Software, hardware?
Invincible Cloud: Oh, yeah. I don't know what those words mean, because I have not... But when I think a load balancer, I'm like, What is it called when your load balancer just simply has four services? Like the four services? So this is like... Yeah. So it just picks between them? No doesn't pick it just does... like it just evenly distributes it. Yeah, it doesn't. Doesn't analyze traffic is just like a request between three different nodes. That's the type of load balancer. I don't know the name for it. Okay. So we have... Okay, so here's another question. So I can have one. I can have one load balancer, global or one per continent. So everyone's going to call everyone... any rational actor would call the load balancer on their continent. Because this is good, so I can assume that right? So this is like geographically spreading as well.
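The strategy described here, handing each request to the next node in turn without inspecting traffic, is commonly called round-robin load balancing. A minimal sketch, with the class name `RoundRobinBalancer` and the node names chosen purely for illustration:

```python
import itertools


class RoundRobinBalancer:
    """Cycles through the nodes in order, one request per node."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._cycle = itertools.cycle(nodes)

    def pick(self):
        # No traffic analysis: just take the next node in the rotation.
        return next(self._cycle)


balancer = RoundRobinBalancer(["node-1", "node-2", "node-3"])
picks = [balancer.pick() for _ in range(7)]
print(picks)
```

Round-robin ignores how busy each node is, which is acceptable when every request costs about the same, as it does for ID generation.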
Golden Possum: Don't you think that the single point of failure is only one?
Invincible Cloud: Yeah. Okay.
Golden Possum: Okay, that's fine. But you know, right, you maybe want to have another one, which is active-passive or active-active, right. So maybe, maybe not, don't include that count? Or maybe I know I need to load that or something. But just say, you know, I have another set of load balancers in the background. It can be one active, two passive or something like that, right? Yes, exactly. So avoid a single point of failure.
Invincible Cloud: Okay. So then I'll draw my diagram here. All right, so I think...
Golden Possum: So you have a client. Client will call the get request. So what exactly will happen? So don't you think it will go to the nearest? I mean, the DNS, DNS will resolve. So I want the complete structure. Can you do that?
Invincible Cloud: Okay, so then there's something in front of the get request, basically, was direct action...
Golden Possum: You can actually use the draw function here.
Invincible Cloud: Okay, let's see. So, yeah, I'm still in HTTP land. So how is DNS going to resolve my service? All right. So my idea is that like, we'll have a continent one, continent two or some other geographical distribution of DNS records. And that will point you to the correct...
Golden Possum: I don't want to know the details about the DNS. I just want to know the client requests come from the client, it goes to the nearest DNS and DNS resolve IP address goes to the load balancer. Right. I just want that. Yeah, that as simple as that, because we have to coordinate on scaling in later service.
Invincible Cloud: So yeah, each DNS points to a specific load balancer, which has three nodes. So each load balancer has n nodes, I guess we'll call it. Yeah. So that's the architecture, I guess, for now? Yes, we're here. Make it a bit smaller. Okay. There's the architecture of the request. So you have your client send a GET request to continent one, two, or three, and the DNS resolves to the IP of a load balancer. So now we're at the load balancer; the load balancer will distribute all requests evenly, I don't know what the word is, to each node. A node itself will have... Okay, so... Alright. So we'll draw a node.
Golden Possum: No, no, it's me. I kind of connected from the other iPad.
Invincible Cloud: Yeah, cool.
Golden Possum: Okay, so let me put this. Are you writing something because I'm not able to see it?
Invincible Cloud: No.
Golden Possum: Sorry. So let me... Here is your client, right? Yep. And we'll make a connection; here is your DNS resolver, right? Yep. And there is a load balancer. And you have a backup of load balancers here. Right. And from here, you know, you have all of your services. They are distributed right now. All right. So maybe for a scalability issue, you can have, you know, all the different microservices somewhere in AWS. Right? Yep. So you're gonna call them s1, s2. Right. And here is your... so the algorithm that you just created right now, yep, here it is; maybe you can call it the KGS. Right? Key generation service. Right. So what are the single points of failure at this point? KGS. Very obvious, really only one. So you can have, you know, this is obvious, right? Yep. That's it. So this is the small diagram. And right. Yeah, so I want you to draw this. Yeah, exactly. Yeah, please continue now. Or you can edit that if you want.
Invincible Cloud: Yeah. All right. So I know it was multiple KGS. So yeah, we have multiple key generation services. So what I want to do now is like, sort of elaborate, internally how it works. So the key generation service works internally where it assigns you an ID or a timestamp. And it holds in memory that timestamp for let's say, for some amount of time, with its associated vector clock. Is that is that the stage where I should move? Because the diagram is roughly drawn already. So now I'm elaborating, like, yes. So. So the KGS.
Golden Possum: So I'm sorry? What is your name, by the way? Oh, my name is Marvin.
Invincible Cloud: Marvin. Right. So let's think about this, come to the drawing, I want to let you know another thing here. So in this diagram we have a KGS. Right? So think about the single point of failure. Okay, so I want to ask you about the latency. Right? What can you do so that it will have minimal latency?
Golden Possum: What can I do? I'm trying to think what the source of latency is now. So we can have better geographically distributed services? So right now, we don't have any locks. Right? It seems pretty quick. So the generation service is performant, so there's the optimization, right?
Invincible Cloud: So you have the vector clock, you have the app. So you said, you know, with that clock, you can increase that counter until, finally, maybe... Yeah, and then when it goes to a new nanosecond, you're going to reset the counter. Again, you can start from zero, right. So you already have the basics. So do you think you can add a cache, right?
Golden Possum: Can I add a cache?
Invincible Cloud: Hmm, yeah, definitely. Right?
Golden Possum: To which service, it's an ID service. I can't cache an id. I need a new id every time. What can I cache?
Invincible Cloud: So now... let's think about it. So now the interesting part, right? If you add a cache, why is there a need to add a cache? Okay, so that's another thing. So before you call the KGS, let's think about this. Right? So maybe I should draw something for what I'm trying to explain. Okay, so. There's a network call between your microservice and the one server called KGS, right: it goes over the network, invokes whatever the method is, and then gets back to you. So a network delay might happen, right? So yeah, what if we use a Redis-class cache here, or some type of cache, in between your server s1 and KGS? With the cache, instead of calling your KGS on every call, you call it only once in every 1000 calls. Okay? On that one call, you get all 1000 unique IDs from KGS and put them in your cache. You know?
Golden Possum: Yes. I forgot about this. Okay. Yes.
Invincible Cloud: Think about this. s1 will do the same, s2 will do the same. So on the first call, right, it will make a trip to KGS to get all the 1000 and put them into its memory. Then, until all of these 1000 unique IDs are used up, s1 will be using the ones in its own cache. So let's say s1 went down. How many keys are you losing? 1000 keys? It's nothing. Yeah, right. When you're talking about trillions of keys, 1000 is nothing.
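The batching idea described above can be sketched as two small classes: a key generation service that hands out blocks of IDs, and a front-end service that caches one block locally and only calls the KGS when the block runs out. Everything is in-process here; the class names, `allocate_block`, and the `kgs_calls` counter are assumptions for illustration, and a plain Python iterator stands in for the Redis-style cache.

```python
import threading


class KeyGenerationService:
    """Central allocator: hands out disjoint blocks of consecutive IDs."""

    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def allocate_block(self, size):
        with self._lock:
            start = self._next
            self._next += size
            return range(start, start + size)


class CachingIdService:
    """Front-end service (s1, s2, ...) that serves IDs from a local block."""

    def __init__(self, kgs, block_size=1000):
        self._kgs = kgs
        self._block_size = block_size
        self._block = iter(())  # empty until the first request
        self._lock = threading.Lock()
        self.kgs_calls = 0  # how many round trips to the KGS were needed

    def next_id(self):
        with self._lock:
            try:
                return next(self._block)
            except StopIteration:
                # Local block used up: one trip to the KGS refills it.
                self._block = iter(self._kgs.allocate_block(self._block_size))
                self.kgs_calls += 1
                return next(self._block)


kgs = KeyGenerationService()
s1 = CachingIdService(kgs)
s2 = CachingIdService(kgs)

ids_s1 = [s1.next_id() for _ in range(1500)]
ids_s2 = [s2.next_id() for _ in range(500)]
print(s1.kgs_calls, s2.kgs_calls, len(set(ids_s1 + ids_s2)))
```

If `s1` crashes, the unused remainder of its current block is simply lost, which is the "1000 keys is nothing" trade-off from the discussion.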
Golden Possum: Oh, I see. Okay, yes. Thank you.
Invincible Cloud: Right. Yeah, go ahead.
Golden Possum: Well, no, I have nothing interesting to say I didn't think about it before but now that I've taken over this. Yeah. So that's a way to do we can cache the keys ahead of time and distribute them with like in memory key value store. Okay. Yes, this is about the limits of my knowledge. You can end it here. Because unless there are more things, this is super useful for me, by the way, but I am drawing a blank like when I see further optimizations we can like we can calculate keys ahead of time in cache. So I'm thinking. Do I need to make this more distributed? Doesn't seem like it. Can I make it faster? Without, like pre generating keys? No. So to me, this looks like I mean, I just want more trade offs I could make this is real interview, I would certainly try and make trade offs. But even then, I'm struggling to think, what could I? How could I tweak the system for specific use cases? I don't, currently. No, no.
Invincible Cloud: That's fine. So the thing is, right. So if you have in the real interview, right, so if you have a 45 minute or 60 minute interview, the only thing is you have to keep on talking, right? So keep, like, ask your own questions, like ask the questions, as many questions as you can. Right? Find the faults in your own design. Right? Try to find the faults in your own design and try to come up with a solution by resolving by yourself designing a system they accomplish with some employees. But the thing is, the way you approach and the way you solve your own issues, that make sense. I mean, that will make a lot of sense. So I'm going to put up a review for you. That's fine. So let's come back to the diagram, right. So you have KGS, you have all of these microservices who are easily deployable, so it will take care of your load balancer will take care of, oh, if it is deployed in the AWS, you don't need a load balancer to, you know, to on our side, right, that's another thing. Unless, if you deploy it on our on prem, then you need a load balancer to deploy it in the AWS or some kind of a storage, I mean, or Kubernetes or something, right? You don't need a load balancer because that's a cloud. So Cloud will take care of the load balancing issues. Right. Okay. That's good. So let me maybe I'm going to address this a little bit. Okay. So, another question is, so what what will happen after this right, so the call it goes to the KGS? Yeah. So, it will get the response it will be back to the caller. Good. So what can happen after this?
Golden Possum: So service one, just service one... what's like the call goes from service one, and then it goes through with a load balancer goes back. Good. Back to the client.
Invincible Cloud: So there is another thing here. Okay. Since this is distributed, right, and we want this to be distributed globally, all around the world, and you're talking about taking a timestamp: what about people in Southeast Asia or elsewhere on the Asian continent, where there's all of this time difference, right?
Golden Possum: You're like one. But what's the Universal Time? Yes? GMT. We do our timestamps in GMT, of course, so if I can write that GMT.
Invincible Cloud: So you may end up with duplicates when you have all of that, but you are saying that we can use only GMT.
Golden Possum: So, like the client doesn't actually care about the time he only cares about the unique IDs. So if we have geographically distributed computer centers, as long as they're still using like, epoch time, they should be fine. As long as they are using only one timezone, the client doesn't actually care about the timezone, and the key generation service can make sure like all its distributed services never... Wait.
Invincible Cloud: But you have multiple KGS deployed in multiple continents? Right? Oh, yeah. It will not go well.
Golden Possum: Right. So how would you handle that? I mean, reusing the cache strategy now sounds really useful. Because we can have one KGS... Well no, that's a single point of failure. We can have a KGS with backups. So, a KGS service, and the KGS service, of course, has backups in case it goes down, but then we just have a bunch of Redis servers that are geographically distributed, which are continuously requesting keys from the original KGS. And that seems like an ideal architecture, because we will never get repeats. But all of our clients around the world will always get a speedy response time, assuming nothing explodes, like their rate of keys. So I guess we can have a... Why is my drawing tab not...? Yeah, my drawing tab has since decided not to cooperate. But yeah, so we just have a bunch of Redis servers around the world querying one KGS and caching the results over time. I think this is ideal because we never get key collisions. Is there anything wrong with it? For example, if the rate of requests goes up, the Redis servers could run out. It doesn't seem too difficult to handle a really high rate, or at least varying rates: if I detect there are more requests coming in, I can increase the number of keys that I currently have. Does that sound useful?
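The idea of fetching more keys when the request rate rises can be sketched as a sizing rule: ask for enough keys to cover the observed rate for some target interval, within fixed bounds. The function name `next_batch_size`, the 10-second interval, and the bounds are all assumptions chosen for illustration.

```python
def next_batch_size(observed_rps, refill_interval_s=10.0,
                    min_size=1000, max_size=1_000_000):
    """How many keys a regional cache should request from the KGS.

    Sized to cover `refill_interval_s` seconds of traffic at the observed
    request rate, clamped to [min_size, max_size].
    """
    wanted = int(observed_rps * refill_interval_s)
    return max(min_size, min(max_size, wanted))


# Quiet region: falls back to the minimum block.
quiet = next_batch_size(50)
# Busy region: scales with traffic.
busy = next_batch_size(10_000)
# Extreme spike: capped so one region cannot drain the ID space.
spike = next_batch_size(10**9)

print(quiet, busy, spike)
```

A larger batch means fewer trips to the single KGS but more keys lost if a cache node goes down, so the upper bound is a trade-off rather than a fixed answer.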
Invincible Cloud: Okay, so what are the other drawbacks?
Golden Possum: I guess the largest one is that if too many people are requesting keys, then the Redis servers basically become slow, because the KGS is not geographically distributed. It's in one place with one time. It has backups, so it will not go down for very long.
Invincible Cloud: But don't you think the Redis cluster can handle the load for you?
Golden Possum: But if there are too many requests, then each Redis node will run out of IDs and will have to ask the KGS, but I guess we can pre-cache pretty far ahead for that. So if we don't have to worry about that, then this architecture looks very good to me. Let's see what the problems with it are. I don't know. Our single point of failure is obviously our one KGS. But we make backups for that. So we have our KGS, I don't know, in Europe somewhere, and we have backups for it. Every Redis server, one in North America or South America or something, will ask the KGS for keys and distribute them quickly. I'm trying to poke holes in my own design. Is it fast? I think so, because you're pre-caching IDs. Is it reliable? We can make a backup KGS, we can fail over Redis servers, stuff like that. Performance... The KGS itself is quite performant. Redis is quite performant. The only point of failure that isn't covered by backups or failover is too many people requesting keys. But this can be solved, because we can request a high volume of keys. One problem: we can lose keys if the server goes down. So let's see, what are the trade-offs you always have to make for a distributed system? Well, this is kind of a distributed system, but it still has, like, a single source of truth: the KGS. Yeah, I'm thinking about what the analogues are to other services which have this problem, but we basically have an append-only service. So we never have to resolve conflicts. You know, we have append-only. What's the word for it?
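The idea of pre-caching further ahead when demand rises can be illustrated with a small sizing rule. This is only a sketch under stated assumptions: the function name `next_batch_size`, the doubling and halving policy, and the `min_batch` and `max_batch` bounds are all hypothetical and are not something specified in the interview.

```python
def next_batch_size(current, consumed_last_window,
                    min_batch=100, max_batch=100_000):
    """Pick how many IDs a regional cache should request next.

    current: size of the batch requested for the previous window.
    consumed_last_window: IDs actually handed out in that window.
    """
    if consumed_last_window >= current:
        # Demand met or exceeded the batch, so the cache would have run
        # dry: fetch further ahead next time (capped at max_batch).
        return min(current * 2, max_batch)
    if consumed_last_window < current // 4:
        # Heavily over-provisioned: shrink so fewer IDs are lost if this
        # cache node goes down while holding unused keys.
        return max(current // 2, min_batch)
    return current
```

Larger batches reduce round trips to the central KGS but widen the gap in the ID space if a cache node crashes with unused keys. Those keys are lost rather than duplicated, so uniqueness still holds.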
Invincible Cloud: Okay, so yes, that's good. So I'm just going to end the interview. Before that, I want you to tell me about the HTTP status codes that you are going to return, you know, if the connection is invalid, or if your service is down. So how are you going to do the exception handling, or what are you going to return? You know, return the status codes to the user.
Golden Possum: Okay, so I don't remember all my status codes. But if something goes down, it should have a hot failover. Everything has a hot failover. So I'm not anticipating throwing any 504 errors; of course, you eventually can, if everything goes down. Let's say somebody has used all the IDs of the Redis cluster. I mean, if someone has used all the IDs of a Redis cluster, you can either return resource unavailable, or you just have to wait for the next one. So yeah, we have internal service failure. But currently we're not asking the clients to limit their requests. So they're just never going to get... what are they called? A 400 is when the client sends bad data. So the clients will never see a 400 status code, because we're currently not asked to rate limit, and they're not giving us any information. So we'd have, like, 500. I don't remember my 500 codes precisely. But we could have a 500 in case our services actually go down and there are no backups. And maybe if we want them to know really quickly whether or not they have an ID, we can send a 300, like resource temporarily unavailable, if we run out of keys. Does that satisfy your HTTP curiosity?
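For reference, one possible mapping of these failure cases onto standard HTTP status codes is sketched below. The outcome names and the `status_for` helper are hypothetical and were not part of the interview. In the HTTP specification the 3xx range is reserved for redirects, so a temporary shortage of keys is usually reported as 503 Service Unavailable, and rate limiting, if it were added later, as 429 Too Many Requests.

```python
from http import HTTPStatus


def status_for(outcome):
    """Map a (hypothetical) service outcome to an HTTP status code."""
    mapping = {
        "ok": HTTPStatus.OK,                            # 200: ID returned
        "bad_request": HTTPStatus.BAD_REQUEST,          # 400: malformed client input
        "rate_limited": HTTPStatus.TOO_MANY_REQUESTS,   # 429: client over its quota
        "keys_exhausted": HTTPStatus.SERVICE_UNAVAILABLE,  # 503: cache temporarily empty
        "upstream_timeout": HTTPStatus.GATEWAY_TIMEOUT,    # 504: KGS did not answer in time
    }
    # Anything unexpected falls back to a generic 500.
    return mapping.get(outcome, HTTPStatus.INTERNAL_SERVER_ERROR)
```

A 503 response can also carry a `Retry-After` header, which matches the "wait for the next one" behavior described in the answer.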
Invincible Cloud: Yes. Okay. So I think that's good. So it's quite long. Okay, sir. So I only did the annotations. Okay. I will add a bit of feedback as well.
Golden Possum: Okay. Thank you very much. This is extremely useful.
Invincible Cloud: All right. We'll see you then. Bye.
Golden Possum: Bye.