Watch a technical mock interview with an Amazon engineer
Infinite Shadow, an Amazon engineer, interviewed Quantum Badger for a System Design interview
Summary

Problem type 

Online file storage

Question(s) asked 

Design a file system storage like Google Drive or Dropbox

Feedback

Feedback about Quantum Badger (the interviewee)

Advance this person to the next round?
  Yes
How were their technical skills?
3/4
How was their problem solving ability?
3/4
What about their communication ability?
3/4
I am inclined to hire the candidate. The candidate discussed a multitude of things and showed depth of knowledge in various parts of the design, especially in:
- Gathering requirements (both functional and non-functional) and capacity estimates
- Database design
- How availability and consistency can be guaranteed
- Caching
Good understanding of building microservices and distributed systems architecture.
Improvements:
1) There was some back and forth on how the consistent hashing works. Is it for the write servers, the database servers, or both? We lost some time there.
2) Time management is an important aspect of design questions. I would recommend finishing the requirements and capacity estimates in 8-10 minutes, leaving a good amount of time for discussing the meat of the design.
3) Minor: regarding database design, I was expecting you to come up with two different data stores right away, one for metadata and the other for object data.
4) I was expecting file chunking to be done on the client side rather than the server side.
I would recommend solving some design problems on your own, then looking up the solutions to see where you could improve.

Feedback about Infinite Shadow (the interviewer)

Would you want to work with this person?
  Yes
How excited would you be to work with them?
3/4
How good were the questions?
4/4
How helpful was your interviewer in guiding you to the solution(s)?
3/4
Not sure about how mentorship session is conducted, but I was probably expecting both the interviewer and interviewee to work together to solve a problem. More like a conversation. Time could be challenging. Nonetheless, it was helpful to think through and get feedback from a professional interviewer.
Transcript
Infinite Shadow: Hey, how's it going? Quantum Badger: Good. How are you? Infinite Shadow: Doing good. So this is for systems design mentorship practice. Quantum Badger: Right. Infinite Shadow: Gotcha. Can you provide a little background of yourself and where you're in the interview process right now? And what are your expectations from this interview? Quantum Badger: Sure. So I'm a software engineer, and in the data side, so you can call it, you can call me data engineer. And I have had over 4 years of experience. And I'm right now working at Lyft. And then I have a few interviews coming up next week, for onsite. And, over last year, right, I did take few interviews, and then system design is sort of my weakest points. And so I want to make sure that, if I'm following the structure well, or I want to see where I'm missing things. That's why I choose to do a mentorship interview. Do you want like, expectation from this call is to make sure like, you know, to understand from the other side, like what is expected out of system design? And what are the red flags? What are the things not to do and how to how to structure it and how to go about it. Obviously, I followed the structure given by grokking the system design interview, and I think that I failed because of following the structure because, and I thought to myself, like, you know, going with intuition is better not blindly following the predefined structure. So this is what I have in my mind at this point of time. Infinite Shadow: Okay, so if I understand correctly, did you do system design interviews last year, and it didn't go well? Quantum Badger: Right. Yeah, I think it's because coding wise, I'm pretty sure I'm okay with that. I think system design is where probably, I might have offers rescinded because of that. Infinite Shadow: And for data engineers, like what is the bar for system design interviews? Quantum Badger: I think I have an interview with instacart. 
And they have a dedicated round for system design. Initially, like, mostly people don't put system design for data engineers. But this is in the data influencing. So the bar is pretty much the same as for software engineers, and I have a software engineering background as well. Infinite Shadow: Okay, sounds good. You want to get, you know, in the right direction, and get the feedback? Quantum Badger: Right, yeah. Infinite Shadow: Okay, sounds good. So, do you want to solve a problem, and then, as you solve it, do you want me to go through what the expectations are? Quantum Badger: Yeah, that sounds good. Infinite Shadow: Okay. Design a file hosting service, similar to Dropbox or Google Drive. Quantum Badger: Okay. I don't think I have done that before, so that's probably a good idea, I guess. Have I used what? Infinite Shadow: This platform before? interviewing.io, yeah. Quantum Badger: Not for system design. Infinite Shadow: Gotcha. So there's just a draw tab. That is the only difference from coding rounds: it is at the bottom left of the screen, where you can draw a high-level or detailed design diagram. Right. Okay. Yeah, and also, let's target to complete in 40 to 45 minutes. That could be a good fit for the majority of the companies. Quantum Badger: Okay. So the idea is to build a file hosting service, like Dropbox or Google Drive, right. So, the first thing I would like to start with is the use cases that are expected. Obviously, this is for an interview and we have limited time. So I want to understand the functional requirements, what is expected out of this. So, to begin, this is Google Drive. For the use cases, is this something I have to come up with? Or should I ask the interviewer to give use cases to me, to see what he's interested in?
Infinite Shadow: Yeah, so the, usually the expectation from the system design interview, at least in my sense, it is, basically, you will be given an ambiguous problem. And what I want to see is, how does the candidate deal with an ambiguity, let's say, once you join a particular team, I'm already working at least, I mean, that is a really good company. And you might already know about this. So basically, you will be given a design or a mini design in your project, and most of the time, you will know where to start with an arrival, you will know those requirements, and what are the design goals? The latency requirements? What are the other services that you have to interact with? Or make changes to? And what are the milestones and stuff like that there is a goal, like how do you deal with ambiguity? And how would you, you know, provide structure to the problem, which you're trying to solve, where you don't have any context of it. So most of the time, I would recommend the candidates, you know, to come up with their use cases and ask clarifying questions. And the interviewer should able to correct and or add on to the requirements that you provide. Quantum Badger: Okay. Sounds good. Yeah. Okay, then, in that case, Google Drive, the primary or the primary workflow, or the use cases is to upload a file. And that's the, I'm just gonna write a few features here. Upload a file to drive when we talk about a file, then probably we need to discuss further on what kind of files we have to upload, we have the support, it will be text files, or media files, or like, you know, depending on that. So I'll touch base with this. Once I'm done with the rest of the use cases, the first thing is uploading, uploading a file to drive. And once we upload, the obvious next step is to download or like to fetch the file which we uploaded to the drive to download a file from the drive. What else? What else? What else are the use cases apart from uploading and downloading files? 
Infinite Shadow: I believe to share the files with its subset of users. Quantum Badger: Sharing files with others. And then there is also a feature called syncing, right, like you can you can sync a particular drive or folder. So do you think that that's something we can handle, or is that out of scope? Infinite Shadow: No, that is in scope. Yeah, that's that's definitely a great feature, like syncing across multiple clients and also syncing a folder on a given client. Quantum Badger: Okay. All right. So is this a good number of use cases to tackle or? Infinite Shadow: Yeah, I think these these sounds good. Quantum Badger: Okay. When it, like, you know, I mentioned I would iterate on this. So do we care about any file types? Or does it like, are we particularly identify a particular file, file type? Or should it support anything? Infinite Shadow: Yeah, it's just about anything. Quantum Badger: That's a generic file. This is functional requirements. The next part is non functional requirements. So here, we would like to talk about scale, right? So when we when we talk about scale, we need to know how many users are we expecting, right? Like total number of users and daily active users, so do we have any numbers for this? Infinite Shadow: I mean, you can you can give it a shot and we can iterate on it. Quantum Badger: Okay, so probably I think 500 million total active, total users and daily active users probably we can say 10% of the total active users. So 50 million. So this is a lot of users. So this has to be highly available system. And so talking about in terms of nine, I think, four nines is a pretty, like, pretty good system to design. Obviously, we can go for five nines as well. So this is a pretty is is a good, good figure, or should it has to be five. Infinite Shadow: I'm playing with four 9s. Quantum Badger: Okay. And then, so one more thing is, is this system going to be write heavy or read? Read heavy? Right. 
So we would probably, it's going to be read heavy, I believe. Is that true? Infinite Shadow: I mean, it may not be it can be I mean, depends on the use cases, I guess. But I will say the Read and Write are pretty much balanced here. Because some customers just want to write file and make use of them. And some, you know, like to share files, and you know, so yeah. Quantum Badger: Okay. So again, now, this might be a dumb question. But like, you know, in Google Drive, we can actually view it in browser. Right. So do you think that that could that could be considered a use case? Or is it is it part of downloading a file? Infinite Shadow: Yeah, I'm not too concerned about it. To be frank. I mean, I'm mainly concerned about the design, like how do you how do you, you know, is reliable and available and also durable? Quantum Badger: Okay. Okay, so, in terms of, obviously, in terms of latency, we expect, like, you know, we expect the downloading part to be quicker. And since the Read and Write ratio is similar, so we we obviously expect this to be a very low latency. And one more important thing is consistency. And basically, the data durability. Since, like, since we are when we upload something like we assume once it's successful, we assume that it's going to, it's going to stay consistent and durable over that. So I believe, like these two, durability is one of the most most important thing. Coming to consistency, let's say if the file is being, Oh, do we care about strong consistency or eventual consistency? And the question is? Infinite Shadow: I think in this case, eventual consistency is fine, especially if you're thinking across multiple clients. Right. Quantum Badger: Okay. Yeah. Okay. I think I have I've got a good idea of in terms of like, how big the system is, and I think I've captured most of the highly available or like the the non functional requirements here. 
So, the next thing is like, do you want to, they want me to do the estimates for you know, the bandwidth estimates or the storage estimate? Infinite Shadow: Yeah, that would be good. Quantum Badger: Okay. So obviously, we have 500 million users. So per day basis, how many how much so if we talk about per day user data, like, you know, how much does a user upload per day? So probably 5mb does that sound good for you. Average data uploaded by user per day. Yeah. Infinite Shadow: Yeah, that sounds good. Quantum Badger: Okay. So if that's per user, then obviously, we have 10 million active users. So 10 million into 500. terabytes. So that's easily 5k. And now, so 5000 terabytes, which means five petabytes per day. I think this makes sense. So we are talking about five petabytes of data storage per day. Whereas I have not defined any API's yet. So should we like, do you want me to ideally, do you want me to go ahead with the high level design or drag it into the API's and try to get the network bandwidth required for upload or download or whatever? The other API's, which might be interested? Infinite Shadow: When you say API's, like APIs use for the clients to upload the data, right? Yeah. Yeah. I'm not super interested in API design. Quantum Badger: Okay. So we can go ahead with I mean, the main thing is, since this is a drive, right, like this, the main thing is how much data we are uploading. So this is per day. So this, I think this, this might be enough. I can do estimates for five years or for one year as well. Do you think that is required at this point of time? Infinite Shadow: It's okay, we can revisit that later. Quantum Badger: Okay. Oh, all right. So, let me see I have not used this feature. Okay. So, let's assume we have a client here. Considered not that pretty to use the tool. we need. Okay. Yeah, I'm not able to can do not able to delete it. 
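The storage estimate above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the conversation (50 million daily active users, 5 MB uploaded per active user per day) and decimal units; the constant names are illustrative. With these inputs the daily volume comes out to about 250 TB rather than 5 PB, so the spoken figure looks like a slip in the mental math.

```python
# Figures quoted in the conversation (assumptions, not measured data).
DAILY_ACTIVE_USERS = 50_000_000   # 10% of 500 million total users
UPLOAD_PER_USER_MB = 5            # average upload per active user per day

MB_PER_TB = 1_000_000             # decimal units: 1 TB = 10^6 MB
TB_PER_PB = 1_000

daily_upload_mb = DAILY_ACTIVE_USERS * UPLOAD_PER_USER_MB
daily_upload_tb = daily_upload_mb / MB_PER_TB
yearly_upload_pb = daily_upload_tb * 365 / TB_PER_PB

print(f"per day:  {daily_upload_tb:.0f} TB")
print(f"per year: {yearly_upload_pb:.2f} PB")
```

If the 10 million figure mentioned in passing were used instead, the same calculation would give 50 TB per day.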
Infinite Shadow: So there is, like, the circle, the third icon from the bottom, or sorry, the third icon from the top. If you found it, the circular one, that is a different one. There is the second one, which is this one. Yeah. Quantum Badger: Okay. Oh, the third from the top. If we expand the second one, which looks like an eraser, this one, okay. Infinite Shadow: Oh, this guy probably would help. Yeah. Quantum Badger: So here we have a client, right, we have multiple clients. And these clients can be using a web browser or a mobile device. And at this point of time, there is no need to create a persistent connection. So I'm going to create this connection on a need basis. So regular HTTP should be good. So the client connects to the load balancer. Once we connect to the load balancer, we can have microservices, one for upload, one for download. Let's call this the write service, or upload. I'm just gonna design a monolithic kind of service and then I'll try to make it highly available later. So a client connects to the load balancer, and from here the load balancer calls our API service, implemented probably as a RESTful service. And then basically this one will upload it to a database. Okay, talking about specifics here, so based on the file sizes of the file size... So obviously, here, when the client asks to download a particular file, or when he shares this particular file with others, they want to download it, or they want to view it. And we obviously can't hit the database every time a user wants to read a particular file. So we can maintain a cache in between the write service and the database. We'll just write it like this. The cache can be Redis or Memcached. And we can use algorithms such as LIFO to make sure that this cache stays consistent.
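To make the cache idea concrete, here is a minimal in-process sketch of a cache sitting in front of the database. It is not the design from the interview: the conversation names LIFO, while this sketch swaps in LRU (least recently used), a common eviction policy for read caches. The class name, the `load_from_db` callback, and the sample data are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable, List


class ReadThroughLRUCache:
    """Serve reads from memory, falling back to the database on a miss."""

    def __init__(self, capacity: int, load_from_db: Callable[[str], bytes]) -> None:
        self._capacity = capacity
        self._load_from_db = load_from_db  # hypothetical database lookup
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, file_id: str) -> bytes:
        if file_id in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(file_id)
            return self._entries[file_id]
        # Cache miss: read from the database, then populate the cache.
        value = self._load_from_db(file_id)
        self._entries[file_id] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value


# Stand-in "database" that records every read so misses can be counted.
fake_db = {"a": b"A", "b": b"B", "c": b"C"}
db_reads: List[str] = []


def load(file_id: str) -> bytes:
    db_reads.append(file_id)
    return fake_db[file_id]


cache = ReadThroughLRUCache(capacity=2, load_from_db=load)
for file_id in ["a", "b", "a", "c", "b"]:
    cache.get(file_id)

print(db_reads)  # the second read of "a" is served from the cache
```

In a deployment like the one discussed, the same hit-or-load logic would sit in the read service with Redis or Memcached as the shared store instead of a local dictionary.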
So whenever we want to write whenever a user writes a particular file, I realize I haven't discussed the schema of the service yet. But whenever a user wants to write a particular file, he writes that he writes particular first to the database, he checks if he writes to the database, and then later the process also needs that particular file in the cache. So late, and then it sends acknowledgement to the client. So the files in the client has been successfully stored in the database. When someone user wants to each file will have a URL, right? You are right. So when someone else wants to download that file, so instead of hitting the database, we can just fetch it from the cache and return it. Since this is just read only. The bottlenecks here is obviously we need to have this service as stateless. So, we need to have multiple instances of it. So that we can use some hashing algorithm such as consistent hashing to map particular users files to one of the services of the write services. Therefore, dimension, so there will be a read service as well. Which basically interacts with the cache. And if there is no data, then only it goes and hits the database, and then it updates the cache. And then it returns the data back. And here we are just using like no request response protocol HTTP since this is a need based kind of system. And then one more bottleneck is the database side the choice of database for the choice of database. So we will have each user will have a user ID and he will have a particular file. So instead of instead of storing the file directly here, because database, like if you're using a relational database, it's not meant for it. But here are things we are just talking about... As usual, uploading your file, and then it can be updated by many people. So there is there is that problem like Google Drive, you can multiple users can work on a single document. 
So for that reason, if if you were to use if you were to use a noSQL database, just a key value store, I think we might have problems with that because of consistency, I believe, multiple users updating or connecting to the same multiple users writing to the same file. That might be a separate task to handle I guess. I think probably I should have checked with you if that is something we need to support, right? Multiple user, multiple document editing? Do you think that that fits in the requirements? Or is that? Is that not valid? Infinite Shadow: So are you suggesting to use noSQL database to store store the files? Quantum Badger: So I was thinking initially, that would work probably better, better when we do not have a problem of multiple users updating a particular document. But since it doesn't it since it doesn't give the ACID properties are like that? Yeah. I think probably this is like a since we will not store it depends on how we store the file. Right? Like, obviously, media files, we will not store directly in the database. So talking about schema here, again, user ID, let's see the files, right. So each file will have a file ID. Who uploaded it? And what's the type of it? And here, let's say you are right, if it's media, then it's better to store in some object storage, such as the s3, and have that link over here. Since media files is not something you would edit, right, you would just upload a new file, if you if you want, you won't go and edit the new existing file. So that makes sense here. But if the file types is something like text file, like doc or something for those, I was thinking, would it make sense to store the contents of the text files as a blob? Or as a as in the database itself? Or should it be stored in separate media like be used for, like we use for audio or videos. And I do not have much knowledge on this. But I think it doesn't make sense to store a text file over some object storage. 
And probably, it might make sense for that to be part of the database column itself, so that any updates can be delegated to the database, instead of us handling them directly. Infinite Shadow: The problem with NoSQL data stores, many of them, is that most of them don't have generous size limits on the value field. I think most of the data stores support values, you know, in KBs. So let's say you want to store a one MB text file: basically we don't have the capability to store that in one row in NoSQL data stores. Quantum Badger: Okay, so there's a serious constraint on the values, right? Infinite Shadow: Yeah, there are serious constraints on the value size. So given that constraint, I think it makes sense to store it in the object data store, which again holds the media information. Quantum Badger: Okay. Then in that case, if we're just storing the media contents directly in the object storage, I cannot think of any analytics being done, or any kind of entity relationship model here, to prefer a relational database over a NoSQL database. So would it make sense for us to use a NoSQL database, which offers sharding capabilities and replication and inbuilt failover scenarios? Since we are talking about five petabytes of data per day. Infinite Shadow: And so, before going there, I had a question with respect to... So you talked about having multiple write servers and having consistent hashing. So how do you plan to use consistent hashing for the write service? Quantum Badger: Let me think through that. So a write is when the user is trying to upload a particular file. And consistent hashing is a way to decide which box to send this particular request to. Oh, yeah, I think, yeah. I mean, by using consistent hashing, we would consistently put the same user on the same box, as long as the box stays alive.
So the advantage of that is we can cache user-related data in that box. And I'm just thinking if there is any scope for having those things over there. Obviously, I would have general services like, you know, user authentication, authorization and stuff in the load balancer itself. So like, I mean, I don't see us. Infinite Shadow: Yeah, go ahead. Go ahead. Quantum Badger: Yeah, consistent hashing is probably the reason I chose this. I think it might work with any sort of hashing here, since at least I'm not able to think of any reason why we should tie a particular user to one box constantly. And obviously, consistent hashing will help distribute the load evenly when we increase or decrease the number of write service nodes in this cluster, adding or deleting any nodes. So probably consistent hashing is a good choice, I believe, and that's why I chose that. Infinite Shadow: So what you're suggesting is to have a cache on the host itself, on the write service itself, and by using consistent hashing you figure out the right partition where the user's data resides, I guess, and talk to that host. Is that what you're intending? Quantum Badger: I think that part I missed, since I got lost in deciding the database, right? So the next step is, once we have the request forwarded to the write service, the write service actually writes to the database. And obviously this has to be sharded, because we are talking about a large volume of data on a day-to-day basis. And to shard, we can use the user ID in a hashing algorithm to decide which shard to map to. So all the files created by this particular user get stored in a particular shard. So I think you covered it. But yeah, that's one thing I missed. Yeah.
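The property mentioned above, that consistent hashing limits how many keys move when nodes are added, can be sketched as follows. This is an illustrative ring with virtual nodes, not the system from the interview; the class name, the `shard-N` labels, the `user-N` IDs, and the choice of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_hash(key: str) -> int:
    """Map a string to a 64-bit point on the ring (MD5 is used only for spread)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes              # virtual nodes per physical node
        self._ring: List[int] = []         # sorted points on the ring
        self._owner: Dict[int, str] = {}   # point -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = ring_hash(f"{node}#{i}")
            bisect.insort(self._ring, point)
            self._owner[point] = node

    def node_for(self, user_id: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, ring_hash(user_id)) % len(self._ring)
        return self._owner[self._ring[idx]]


users = [f"user-{n}" for n in range(2000)]
ring = ConsistentHashRing([f"shard-{n}" for n in range(4)])
before = {u: ring.node_for(u) for u in users}

ring.add_node("shard-4")
after = {u: ring.node_for(u) for u in users}

moved = [u for u in users if before[u] != after[u]]
print(f"{len(moved)} of {len(users)} users moved after adding one shard")
```

With plain modulo hashing (`hash(user_id) % shard_count`), going from four to five shards would remap most users; on the ring, only the users whose nearest point now belongs to the new shard move.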
Infinite Shadow: There are database shards, and there are read and write servers. What they do is they talk to a particular database shard where the data resides, and they cache the data on the host. Right? Is that okay? So how would you plan to handle, let's say, the write server going down for a given user? How are you planning to handle that case? Quantum Badger: If the write server goes down for a particular user, then... Yeah, I mean, if you're using a local cache in that particular node, then we would lose that, obviously, because we just lost that particular server, that particular node. And then what happens? If we lose the write server, then we would use a different... Do you mean in the middle of writing a particular request, before acknowledging it? Infinite Shadow: Let's say the request comes to the load balancer. Right now, the load balancer has to figure out which write server it has to talk to, because from what you mentioned, the write servers are based on consistent hashing. So let's say the load balancer has the logic to figure out the write server, and so it forwards the request, but it finds that the server is down. How would you handle that case? Quantum Badger: Right. If a particular node is down, then I believe that's the responsibility of the load balancer, to forward that request to some other node which is alive, right? Infinite Shadow: Like, are you saying there are multiple write servers per shard? Quantum Badger: So write servers are independent; there is no tie between write servers and shards. When a write server gets a request, it uses a hashing algorithm based on the user ID to figure out which shard the file should be stored in. Infinite Shadow: Gotcha, gotcha. So all you're saying is the write server is stateless.
Basically, the load balancer can forward to any write server, and the write server will have, you know, the logic to decide which database shard the data actually resides in? Quantum Badger: Yeah, where to store that particular file information. Infinite Shadow: Gotcha. Okay. No, I think I was mistaken in my understanding. Okay. Quantum Badger: So it's okay. I'm still not clear on the database part. And this is one more thing I would like to improve my knowledge on. So we need to track the files created by a particular user, right? So if I were to have a table like files, each file would have a unique ID, which will be UID it seems, and then... We would tie in which user generated it, and then the type of the file. And then, since we decided on using object storage for everything, what's the URL of this? And probably some timestamps, like, you know, created at and last modified at, all those fields. So if you're using these fields, then, since there is a proper structure, we can probably use SQL databases such as MySQL. And I was just thinking, did we discuss the volume of the writes we are handling? I think we just spoke about the data and not the number of entries. Like if a particular user uploads 10 files per day, then we can think about how the volume of this particular database would grow, right? So since we saw a proper structure to this table, we can use a relational database. And since we're not storing any media here directly, it's a simple storage.
And since this explodes as we have more users, and I've known, theoretically, that one of the drawbacks of using a relational database is that it doesn't scale well. Scaling doesn't come out of the box, and since it offers things like transactions, data durability, and consistency, if we shard, then it's going to be hard, because of the extra network I/O that it has to deal with on every transaction query. So this is, again, a question, right? Should we use NoSQL here? Or should we use SQL and implement sharding based on user ID? Infinite Shadow: Right, so I feel like you can use NoSQL or MySQL for storing the metadata, and I think you can achieve whatever functionality you're looking for using either of the data stores. Personally, I feel like a relational database might be better for this use case, because there's a lot of structured data, right? First you need to store users, and then for each user, whatever files are associated, and for each file you have to store a lot of metadata and stuff, and you need to store which devices a given file has been synced to, and you need to store the permissions associated with a given file, like who can access these files or not, and also some information about the folders that have been synced and stuff. I guess either can be used, either NoSQL or MySQL, but I feel like SQL is the better fit for this problem, at least. And yeah, people say it is very hard to, you know, scale SQL databases, but I don't completely agree with that. There are, like, [unclear], which, you know, extensively use SQL and have developed technologies on top of it to scale better and stuff.
And MySQL itself comes with a replication and, you know, sharding story, so, yeah, that's my take on it. Quantum Badger: So if I hear you right, we can use a MySQL database, we can use out-of-the-box replication, and we can also do sharding. Right. Is that true? Okay. So if you want to talk about sharding, what would be the best strategy to shard here? Should we go with user ID, or the region the user is from, or should it be the device? Infinite Shadow: Yeah, I feel like user ID should be a good key to shard on, or to consistent-hash on, to shard the database. Yeah, that fits the use case better compared to other fields. Quantum Badger: Okay. Yeah, I think it's probably like 45 minutes already. And I missed a lot of points. Like, you know, sharing is probably just one more... Like, you know, adding a column or something, a little database change. In terms of the implementation, it seems pretty simple, at least for me. Infinite Shadow: I'd say, if it helps, let's spend like two or three minutes on the client side, because we didn't talk about the client component at all. So say a user comes up with a one GB or two GB file, like they drop a folder into the Dropbox client on their machine, and it has to be synced. How would that work? Let's say it is a file. Quantum Badger: Okay, it's a very big file, basically. Okay. Five GB of data. Yeah, I think this is a good point. I think we need to, you know, break this particular file into multiple smaller sections. Like the way HDFS stores files, right, we need to break it into smaller parts, and allocate these parts. So that's the high level thing I'm considering.
So like, to be a hundred percent specific, to go into the details, the client hits the load balancer with that much data, so we are still carrying that much data over here. And we would forward that request to the write service. And the write service will divide this data into smaller chunks or smaller blocks. And now it doesn't make sense, because if you upload this five GB of data directly to an object storage such as S3, it does all these things by itself. So we are carrying that five GB of data all the way to the write service? I'm not sure if we have to do anything on our side to break up this large file before storing it in an object storage such as S3. So, like, I'm all ears, I'm not sure how to deal with this. Infinite Shadow: Yeah, so my expectation was, let's say there is a five GB file that is being uploaded on the client side. So what I'm expecting is there can be a component on the client, you can call it a chunker, or file divider or something, which basically divides that file into fixed-size chunks, uploads each chunk, and stores the chunk information, like the index information for each chunk, and then basically it reconstructs that file whenever the user accesses it. So I think it makes sense to have that component on the client side, for multiple reasons. One being, you don't have to, you know, wait to upload that much data in one go. And let's say there is an intermittent connectivity issue: then you lose it, and you have to start uploading everything from scratch. So chunking helps you deal with multiple things. You can do it all in parallel: you can divide the data into chunks and upload them in parallel, and yeah, it's easier to store on the database side as well. I mean, you can use S3 and other stuff to store large files, I guess MongoDB and stuff, but I feel like having that component on the client side makes sense.
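A client-side chunker along the lines described above could look roughly like this. It is only a sketch: the `Chunk` type, the function names, and the per-chunk SHA-256 checksum are assumptions added for illustration, and the tiny in-memory payload stands in for a multi-gigabyte file.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Chunk:
    index: int    # position of the chunk within the original file
    sha256: str   # checksum so a corrupted or partial upload can be detected
    data: bytes


def split_into_chunks(payload: bytes, chunk_size: int) -> Iterator[Chunk]:
    """Yield fixed-size chunks; the final chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, offset in enumerate(range(0, len(payload), chunk_size)):
        data = payload[offset:offset + chunk_size]
        yield Chunk(index, hashlib.sha256(data).hexdigest(), data)


def reassemble(chunks: Iterable[Chunk]) -> bytes:
    """Rebuild the file from chunks that may arrive in any order."""
    ordered = sorted(chunks, key=lambda c: c.index)
    for chunk in ordered:
        if hashlib.sha256(chunk.data).hexdigest() != chunk.sha256:
            raise ValueError(f"chunk {chunk.index} failed its checksum")
    return b"".join(chunk.data for chunk in ordered)


payload = bytes(range(256)) * 10                 # 2,560 bytes of sample data
chunks = list(split_into_chunks(payload, chunk_size=1000))
print([len(c.data) for c in chunks])             # chunk sizes in upload order

# Chunks can be uploaded in parallel and fetched in any order.
restored = reassemble(reversed(chunks))
print(restored == payload)
```

Because each chunk carries its own index and checksum, a failed upload only needs the missing chunks to be retried rather than the whole file.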
Quantum Badger: So we would divide this large file on the client side, and then have the client side upload each block in parallel. And so how are we ensuring that all these blocks belong to a particular file? Where is that information stored in that case? How is that propagated, right? Like, how does it flow through the system?
Infinite Shadow: Wait, right. So, yeah, whenever you make a put request for the chunk, then you need to provide the file information and the user information, and information about where it's coming from. Like, let's say there is a file a.txt, right, which is like one gig. And user ABC is uploading it; then you provide ABC and also the chunk's name and the size, something like that.
Quantum Badger: So user A uploads five GB of data, right? And the client basically divides this into probably 500 MB blocks. Right? So what I was asking is, let's say this is, like, an mp3 or something. So my question was, how do we associate these 500 MB blocks with this particular big file? How do we store that? That was my question for the back end, because when the user asks for this file again, we need to fetch all these things and send them back to the client so that the client can construct it rather than us. Right?
Infinite Shadow: Now, your question is, like, okay, so basically, when you store the information, you need to store a bunch of metadata associated with these chunks. Like, you can store sequence numbers, right: you can have an auto-increment for each chunk, or you can have a sequence number which is strictly increasing. Or you can have, like, a manifest file.
For each file, you can have a manifest file, which actually stores the information of each chunk in an orderly fashion. That way, the first element in the manifest basically points to the first chunk, the second element points to the information for the second chunk, and so on.
Quantum Badger: Okay. Oh, yeah, I think I can probably figure out this manifest file, right.
Infinite Shadow: And let's say each chunk size is like 500 KB, right. So this will be like, you know, chunk one, 500 KB, and the location of the chunk. Similarly, chunk two, 500 KB, or whatever, it can be 400 KB as well, let's say 400 KB, and then the location of the chunk, and so on. This may not be strictly increasing, right.
Quantum Badger: What does location mean here?
Infinite Shadow: The location of the chunk in the... so, we are having two data stores. One is the metadata data store, which actually stores information about the users, permissions, and files, and the other is actually the object store. Right, okay. This could be the location in the object store.
Quantum Badger: Okay. And we also need to relate this as well.
Infinite Shadow: Like one to 10 cents per person. And then two conditions or something like this. Okay.
Quantum Badger: Sounds good. Yeah, so, like, what do you think? Obviously, I didn't go as structured as I'd go with coding, for, like, you know, lack of practice or, not sure. What do you think? What's your feedback for me?
Infinite Shadow: Yeah, let's go to feedback since we have five minutes left. And yeah, I think, overall, the way the interview went, I'm inclined to hire. I think you do understand the system concepts and, basically, designing scalable and available microservices. Yeah, coming to the improvement section, I guess.
I feel like there was some back and forth on the write servers and the database servers and the sharding. Initially, I thought you mentioned the write servers being on a consistent hash ring; maybe I misunderstood. From my understanding, initially, the load balancer has the logic to figure out which write server is associated with that user, or with that user and file combination. And it will basically route the request to that set of hosts where the data is present locally and, if needed, fetch that information from the database shard. There was some back and forth, but either would work. I mean, picture it like pods: there can be multiple pods, right? Each pod contains a range of users' data, and all the write servers in that pod are behind one load balancer for availability and redundancy, and the load balancer can just figure out where the data resides and route the request to that write pod. And then that pod figures out whether the data is available locally or, if not, it goes to the right database shard. Either way it works, I guess, but there was some back and forth on that. And, yeah, other than that, regarding the database design. So I was expecting you to come up with the metadata data store and the object data store out of the box, because those two fall into different buckets. One is the metadata associated with your entire system, and the other is the actual files, which can be really large. So since these are files and stuff, they really fit the description of an object data store. I think you came up with the object data store, but you didn't really mention that only the very large files are going to be stored in the object data store, whereas the other kinds of data go in a separate store. Right. So that was one minor piece of feedback on the database design side. And you did talk about sharding.
And you definitely do understand how sharding works and how consistent hashing works, and having multiple load balancers for redundancy and stuff. Yeah, and I was expecting to spend more time on the client component, which we couldn't get to. There are a bunch of interesting things to talk about on the client component, especially how the synchronization works across multiple clients. Like, let's say the same user is uploading data from multiple clients, or even if he's uploading data at one client, how does that data get in sync on another client when the user logs into it? I think there are quite interesting challenges which can be talked about, especially the architecture that you want to use for replicating the data to other devices, like using message queues or other models; stuff that would be interesting to talk about. I was thinking of spending some time on that, but we couldn't get time. And even on the client side, there could be a bunch of other components. Like, there could be a component which listens for changes on the client, and whenever the user drops in a folder, this component is responsible for, you know, picking that up, invoking the chunking component or whatever, and interacting with the synchronization service or the load balancer to write to the databases and stuff. Yeah, one piece of feedback I had was, yeah, we could have covered a lot more stuff. I think the main core of the design, especially the client component and the synchronization component, I would have been happier if you had covered. So look out for time management. And if you had come up, out of the box, with the metadata design and the data store design and how the write servers interact with the databases in one shot, then I guess we would have saved some time. So that is another piece of feedback I had.
Quantum Badger: Okay.
And, like, what would you suggest, ideally? The flow, right?
Infinite Shadow: Yeah. You mean, you mentioned that in some interviews you followed this flow before and it didn't work out well? I'm not sure what the reasons for that were, but I actually like the candidate to approach a problem in a structured way, because that's how you write your design docs, right, in your company or any other company: you basically start with requirements and non-functional requirements, then capacity, and then you go into high-level design and detailed design. I really like that approach. But if someone doesn't like it, I don't know what could be the reason.
Quantum Badger: Now, the problem with that is the same thing which happened now, right? Like, obviously, you were more interested in the client side. And if I just strictly follow this, probably, I think, maybe this part is unnecessary, or maybe that part is unnecessary. Like, you know, because there is only so much time. What I meant is, instead of strictly following that, after having established use cases and stuff, understanding what the interviewer wants to focus on and then delving into that, rather than going through, you know, bandwidth estimates and database design and high-level design, and then delving in. Probably, you know, maybe I should be very fast. But that's one of the reasons I thought it didn't work out for me. Just, probably, being a little slow.
Infinite Shadow: Yeah, I understand your question. So basically, if you follow the structured approach, you might not have enough time to cover the components that the interviewer would like to cover. Or you might lose some time discussing some components. Right, right.
So I think the way around that is, like, for this interview, I think you initially took 15 minutes, I guess, to get through the capacity section. Usually, I would reduce that time to maybe eight to 10 minutes. So in eight to 10 minutes is where we'll figure out functional requirements, non-functional requirements, and some part of the capacity estimations. And after eight to 10 minutes, you should be ready to, you know, go to the next section that is relevant, like API design, or whatever it may be, like high-level design or database design. So in this case, specifically, I didn't need you to go through API design, because I was not interested in that; I was mainly interested in these components. Usually I set the expectations with the candidate: hey, I'm interested in these components. But that's after your high-level design, when you're going into detail. I was thinking of mentioning that but, as I said, we lost some time talking about the database design and also the write service. Otherwise, I would have definitely asked you to move on to a different component, for sure. That is one thing you can't control. If the interviewer is not setting this expectation, you can say: hey, I'm going to start with a high-level design. Let me know if you're interested in talking about one component in depth; otherwise, I'll go component by component. You can set those expectations beforehand.
Quantum Badger: Okay. And regarding database design, right, logically, should this come before delving into high-level design? Probably that could have helped me decide. If I had come up with the database design right before going into high-level design, it would have helped me decide the choice of database, or at least the way I design.
Infinite Shadow: When you say database design, are you talking about identifying the schema, right, of the database?
Quantum Badger: Database design in the sense of going ahead and defining entities and the schemas and, you know, all the metadata-related information. Yeah. Whatever we have written, okay.
Infinite Shadow: Yeah, to be frank, for this question at least, I am not that curious about what the schema looks like. Rather, I would be more interested in the data store you want to use, like for storing the metadata and also for storing objects. I'm really interested in that component and how you want to ensure that the database meets the growth demands. Like, do you want to shard the database? If so, what kind of sharding algorithm do we want to use? Also, if there are any hot partitions in the data store, where you're receiving a lot of traffic and it is affecting the customers on that shard, how are you going to identify those hot partitions? You know, I'm really interested in those things.
Quantum Badger: Okay. So you would be okay if a candidate doesn't describe the schema and just delves into the bigger picture, like, you know, the choice of database, and how we shard and scale, and all those things. Right?
Infinite Shadow: Especially for this problem. I mean, for other problems it may vary. For this problem, the reason being, there are other interesting components to talk about, right, as I mentioned, the client. So I wouldn't have a candidate spend time on, you know, writing the schema and stuff. I mean, for this problem, at least, I'm not interested in that.
Quantum Badger: Okay. Yeah, that makes sense. One final question is, you know, you mentioned that if multiple users are using the same document or editing the same document, and if you're storing that document in an object storage, how do we ensure the data durability?
Or how do we ensure the consistency, like, that all the users' data is shown to the others and everything which the user has written is stored in the database? Yeah, that's a tricky component to work with. If multiple users are trying to, you know, make changes to the same file, right, then you need to have some kind of locking mechanism, like, you need to lock components of that file. And you need to have, yeah, a scalable locking strategy. And also some kind of conflict resolution, which most probably should happen on the server side, I guess, because it is coming from multiple clients. Yeah.
Infinite Shadow: But, like, would you expect that to be part of the interview?
Quantum Badger: I feel, for this question, I'm not too curious about the, you know, online editing aspect of it. I was just interested in how changes propagate from one client to another client, rather than, if multiple clients are trying to modify the same file, how that works. I think that would be a better use case for, let's say, a Google Doc: if multiple persons are editing the same document, how does that work? I think that fits that problem better than this one.
Infinite Shadow: Okay. And if you had that use case in mind, and if the candidate, while brainstorming, doesn't bring up that particular feature, would you put that to the candidate to handle? Or would you just let it go?
Quantum Badger: You mean, if the actual use case is, like, supporting online editing by multiple clients, and if the candidate isn't able to figure that out? Right. I didn't put that in the use cases. Right. But that's a worry.
Infinite Shadow: Yeah, that's a common use case. But actually, I was not too concerned about that use case. If I felt that it was applicable for our design, I would have definitely mentioned to you: hey, you're missing this requirement. Okay.
So for this problem, I felt like that was not a really important use case to talk about. Maybe if we had time, we could have covered it, but that's definitely an extension to this question. But yeah, I didn't feel that was important. That's why I didn't correct you in the beginning.
Quantum Badger: Okay. And is that considered a negative point if the candidate doesn't come up with good or standard use cases for a particular domain?
Infinite Shadow: I wouldn't say so, unless it is a basic requirement, right. Like, for this problem, let's say the candidate doesn't come up with the upload and download features: that I would consider something, you know, too obvious to miss. But offline editing or online editing, I wouldn't consider it a problem if the candidate doesn't come up with those use cases.
Quantum Badger: Okay. Yeah. And, you know, I'm not sure, does this come with written feedback as well? Yeah. Okay. Yeah, just feel free to put in anything I missed. In terms of, you know, if this was a real interview, how I would have fared and what things I missed. Obviously, you said a lot. But if you think I've missed something else, then please feel free to put it there. Yeah. This is one area I'm working very hard on. So that would be really helpful, too.
Infinite Shadow: So what I would do is take some open-ended problems and try to think on your own, writing down how you're going to solve them. Okay, and I would actually read about that problem, like, let's say, if a company is solving that problem, through their blogs, or you can use standard interview platforms for questions, and try to compare your answer and see where the improvements can be.
Before doing that, I would actually read about all the concepts associated with distributed systems, like how CDNs work, how consistent hashing works, different types of caches, and WebSockets, HTTP connections, TCP connections, and all this; cover all these concepts. And then try to solve the problem, write it down, look up the answer, and see where the improvements can be.
Quantum Badger: Okay. Yeah, that sounds good.
Infinite Shadow: It was nice talking to you, and all the best for your interviews.
Quantum Badger: Thank you so much. Thank you so much for your time. Yeah, good night. Yeah. Bye.
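Consistent hashing and sharding on user ID both come up in the interview and again in the study list above. As a study aid, here is a minimal sketch of a consistent hash ring that maps a user ID to a database shard. The class name `HashRing`, the shard names, the use of MD5 for ring positions, and the count of 100 virtual nodes per shard are all assumptions chosen for illustration; nothing in the conversation specifies them.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes (hypothetical helper class)."""

    def __init__(self, shards, vnodes: int = 100):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> shard name
        for shard in shards:
            self.add(shard)

    def add(self, shard: str) -> None:
        # Each shard is placed at many positions to spread load evenly.
        for i in range(self._vnodes):
            point = ring_position(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def remove(self, shard: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != shard]
        self._owner = {p: s for p, s in self._owner.items() if s != shard}

    def lookup(self, user_id: str) -> str:
        # Walk clockwise to the first position after the key, wrapping around.
        i = bisect.bisect(self._points, ring_position(user_id))
        return self._owner[self._points[i % len(self._points)]]


if __name__ == "__main__":
    ring = HashRing(["shard-a", "shard-b", "shard-c"])
    users = [f"user-{n}" for n in range(1000)]
    before = {u: ring.lookup(u) for u in users}
    ring.add("shard-d")
    moved = [u for u in users if ring.lookup(u) != before[u]]
    print(f"{len(moved)} of {len(users)} users moved after adding a shard")
```

The property worth noticing is that adding a shard only moves the users who now belong to the new shard; everyone else keeps their existing placement, which is what makes this scheme attractive compared with `hash(user_id) % shard_count`.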