Intergalactic Avenger: Hello.
Aerodynamic Crocodile: Hey.
Intergalactic Avenger: Hey, how's it going?
Aerodynamic Crocodile: Pretty good, how about you?
Intergalactic Avenger: Doing good, doing good. I think I'm just going to jump right into it, if that's all right. So this is going to be a data structures question. There'll be a little bit of talking about asymptotic performance, and then a little bit of coding at the end, just to give you an overview. The basic idea here is that we're trying to implement a list, and we're going to try to see what the trade-offs are for the different data structures we could use to back that list. Does that make sense? Can you see me typing? So, we're implementing a list, let's say a list of integers, and for now we'll have just two operations on it: an insert at an index with a value, and a delete at an index. The two possible data structures are an array and a linked list. So, can you describe for me what those operations are going to look like for those two data structures?
Aerodynamic Crocodile: Sure. Insert for a list that's backed by an array could be an expensive operation. Usually, when a list is backed by an array and you hit the end of the array, you don't have any more space, so you need to allocate a new array from somewhere. Typically that means going to malloc or realloc or something, which could be potentially expensive. And on top of that, once you have a new memory segment, you have to copy the old contents into the new block of memory that you allocated. So asymptotically, you're looking at the max of O(N) and however long it takes to allocate a new block of memory; assuming allocation is almost nothing, say O(log N) or something like that, your insert is at best O(N), just because you need to memcpy the whole thing into your new array before inserting...
Intergalactic Avenger: What about what you do with all the elements in the array? Say I have an array of integers like 1 5 7 9, and then I insert the number 8 at index two. The resulting array is going to be 1 5 8 7 9.
Aerodynamic Crocodile: Right, yeah. So that's basically a second pass: you need to move everything down, but you could probably optimize the memcpy to do a single-pass copy instead of a double-pass copy. Say you have an initial memory segment of exactly 1 5 7 9 2 5 and you need to insert something. First you would allocate a bigger chunk of memory; copying into it would be your first pass of memcpy. Then your second pass of memcpy would be to shift everything after the index down by one, which is again sort of annoying. So the naive way would be two memcpys, but you could probably optimize it so that you only need one memcpy and avoid the second, which should be fairly straightforward: you just memcpy everything before the index, insert your element, then memcpy everything from the index onward of the old list.
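A minimal sketch of that single-pass insert in C might look like the following; the function name and the allocate-one-larger policy are just illustrative of the scheme being described, not code from the interview:

```c
#include <stdlib.h>
#include <string.h>

/* Insert value at index by allocating a new block one element larger
 * and copying the prefix and suffix separately, so each element is
 * copied exactly once (the "single pass" idea above).
 * Returns the new array, or NULL if allocation fails. */
int *insert_at(const int *src, size_t len, size_t index, int value) {
    int *dst = malloc((len + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, index * sizeof *dst);            /* elements [0, index) */
    dst[index] = value;                               /* the new element */
    memcpy(dst + index + 1, src + index,              /* elements [index, len) */
           (len - index) * sizeof *dst);
    return dst;
}
```

Calling `insert_at` on 1 5 7 9 with index 2 and value 8 yields 1 5 8 7 9, matching the example above.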
Intergalactic Avenger: Ok, all right.
Aerodynamic Crocodile: But with things like this, I'm assuming a really dumb method of allocating memory. Usually, if you anticipate that you're going to be inserting a lot of elements, or you have a rough idea of how big your array is going to be, you can pre-allocate the array. Then whenever you have to insert another element, you already have the pre-allocated memory, so you can just shift everything down and put in whatever you want, and you avoid the cost of allocating a new block of memory.
Intergalactic Avenger: Okay, makes sense. And what was the runtime of that again?
Aerodynamic Crocodile: For the average case, you could say O(N/2) if you're looking at random-access inserts, but realistically big O describes the worst case, so that would be O(N).
Intergalactic Avenger: Okay, sounds good.
Aerodynamic Crocodile: Okay, so deleting is generally the same idea, and fairly straightforward. Say you want to delete the 8: really all you're doing is moving everything after it down by one, which is another memcpy, nothing more than one copy. The nice part is that you don't need to reallocate the array or anything weird like that; you can just leave the unused memory. Of course you take a memory hit, but it depends on your heuristics. If you wanted, you could reallocate to a smaller chunk, and depending on the use case that may be okay to do, but for the most part nobody does that, you just leave it.
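A hedged sketch of that shift-left delete in C (`delete_at` is an invented name); `memmove` handles the overlapping copy in one call, and the allocation is deliberately left alone:

```c
#include <string.h>

/* Delete the element at index by shifting the tail left by one.
 * The backing array keeps its allocated capacity, as discussed;
 * only the logical length shrinks. Returns the new length. */
size_t delete_at(int *arr, size_t len, size_t index) {
    memmove(arr + index, arr + index + 1,      /* regions overlap, so memmove */
            (len - index - 1) * sizeof *arr);
    return len - 1;
}
```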
Intergalactic Avenger: Okay.
Aerodynamic Crocodile: So, linked lists. Inserting is not as bad, I would say. The difference is that looking up an index in a linked list is not easy: you have to traverse the list to actually reach the point where you want to insert. But the insert itself is quite easy: you just point the previous node at the new element, and point the new element's next pointer at the node that was there before. In that sense it's fairly straightforward. But traversing the linked list is terrible in the sense that it's going to wreck your cache, because the nodes could be anywhere in memory, and each time you jump between nodes you might be evicting what's stored in the cache and destroying performance. If you have a fixed idea of what the index is going to be, though, say the index is always zero, or always the last element, then you're really looking at a queue or a stack. That's not random access, it's fixed access, and it takes away the traversal, so inserting is almost zero cost, O(1) basically. In that case even deletion would be O(1), because the real cost is traversing the linked list.
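The pointer splice being described can be sketched like this in C, with a singly linked node type invented for illustration:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Splice a new node in immediately after prev. The splice itself is
 * O(1); finding prev is the O(N) traversal discussed above.
 * Returns the new node, or NULL if allocation fails. */
struct node *insert_after(struct node *prev, int value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = prev->next;   /* new node points at the old successor */
    prev->next = n;         /* previous node points at the new node */
    return n;
}
```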
Intergalactic Avenger: Okay, so if we focus back on the case of the random inserts and deletes, what are the asymptotic runtimes of those?
Aerodynamic Crocodile: For the most part, the worst case is adding to the end of the list, which is going to be O(N). And deleting, yeah: if you want to delete the end of the list and you only have a pointer to the head of the list, that would again be O(N), because you jump all the way to the end of the list and then delete. So traversing is always going to be O(N).
Intergalactic Avenger: Okay, all right, excellent. So if you can switch over to the whiteboard, I'm going to draw the graphs of those two as you described them. Can you see the graphs I'm drawing here? All right, let's just start with random inserts. On the vertical axis I'll put time, and on the horizontal axis I'll put length of the list. Can you draw for me what the curve is going to look like for random inserts for, say, the array-backed version?
Aerodynamic Crocodile: Sure, sure... Random inserts, I guess random inserts is...
Intergalactic Avenger: Well, yeah, I don't need anything exact, just the basic shape of the curve.
Aerodynamic Crocodile: Yeah, it's got to be a line. That's supposed to be a line, by the way. How do you draw a line on a trackpad, right?
Intergalactic Avenger: Yeah, exactly. So, just to make sure, it looks like it flattens out here?
Aerodynamic Crocodile: No no, that's straight. How do I erase this thing?
Intergalactic Avenger: Yeah, okay, I got you. Good enough. So that's the line for the array-backed one. Now, can you guess, if you were to do a regular, normal linked list, just like you described it, where you allocate new nodes for each element and so on, what the line would look like for the linked list in comparison to this one?
Aerodynamic Crocodile: It could be exactly the same. The only thing, like I said, is that for the array-backed list a lot depends on how quickly you can allocate a new block of memory when you need it. So maybe the line I drew there is wrong, because I'm assuming it takes basically no time to allocate a new chunk of memory for the array-backed version. For a node in a linked list, you don't need to allocate a big new chunk of memory every time, so it would be more of a straight line... hey, that's a better straight line. Anyway, for an array-backed list, instead of a straight line you might see an occasional jump like this, or something like that. I don't know how to describe it.
Intergalactic Avenger: Okay, but relative to that line, where would you put the line for the array... sorry, for the linked list? So if the blue color is the...
Aerodynamic Crocodile: It would be far straighter... sorry, yeah, again, the memcpy is gone and you're just traversing. The average case is just inserting in the middle of the list, I would say, so it should be the same. I'm not seeing the difference.
Intergalactic Avenger: So what if I give you a hint, and you've already mentioned this before: it actually looks like this. You've drawn yours with a certain slope, and mine has a different slope. Can you think of why that might be? You already alluded to it a little before, so I think you have the idea. Think about all the different operations and what actually takes up the time.
Aerodynamic Crocodile: So basically you would say that for an array-backed list, the allocation takes O(N), and the memcpy takes O(N), so that's basically O(2N), whereas for the linked list it's just the traversal, which is N. So it's comparing 2N versus N, yeah.
Intergalactic Avenger: Well, wait... oh, the 2N was the array and then the N was the linked list.
Aerodynamic Crocodile: Yes.
Intergalactic Avenger: But I've drawn it the opposite way. So the red, so the...
Aerodynamic Crocodile: If it's actually the cache operations or things like that, I wouldn't... I don't know how to quantify that in terms of N.
Intergalactic Avenger: No need to quantify it, just the general idea. The blue one is the array. You actually had the answer right previously: when you traverse the list to go and find a specific element, you are going all over memory, and every time you bust the cache, the current cache contents get thrown away inside the CPU, the new contents get read in from regular RAM, and then you find that one element. When you have to find the next element, you throw all of that away and read new data back in. So it's actually quite expensive to do this.
Aerodynamic Crocodile: Yeah, that makes sense, but it's sort of hard to say whether the memcpy is worse or screwing up your cache is worse. I don't know how to write that in terms of O(N).
Intergalactic Avenger: Well, think about it: if the memcpy is happening all within the cache, then it's going to happen very quickly. So you have N operations that are inside your cache versus N operations outside your cache, and the outside-the-cache ones are going to be significantly slower.
Aerodynamic Crocodile: But at the same time, I could counter by saying: suppose all the nodes are well behaved and not sparse, say they're in similar memory regions. In that case, just because you're jumping across nodes doesn't necessarily mean you're going to ruin your cache. It may well be that what you're accessing happens to be within the cache. It depends; a lot of these things depend on the data.
Intergalactic Avenger: Yeah, that's certainly true, and you're right that a lot of this depends on the data; these lines are rarely exactly 1N or 2N. But I do have one counterpoint to the case where the nodes are well behaved and tightly packed next to one another. Can you think of a reason why, even if they're tightly packed and all lined up next to each other, the linked list is still going to be worse than the array?
Aerodynamic Crocodile: Oh, if they're tightly packed together...
Intergalactic Avenger: These effects show up when the lists start to get pretty large, so we can assume, for example, that the list is going to be bigger than one unit of the cache. At some point, no matter what you do, you're going to be swapping stuff in and out of the cache. But is there a reason you can think of why the linked list is going to have worse cache behavior than the array, even in that case, even if the nodes are all lined up next to each other?
Aerodynamic Crocodile: The only thing I can think of is how it's compiled. These things are implemented with pointers, so you'd need to load the next pointer and follow it, maybe three instructions per access. Whereas an array... how is it implemented? Well, arrays are just glorified pointers, so...
Intergalactic Avenger: Let's go back to the pointers. Where are the pointers stored?
Aerodynamic Crocodile: Oh... well, the pointers are what are going to be in the cache, but uh...
Intergalactic Avenger: So you're right: the pointers are in the cache, as is the data, for the linked list. But what's in there for the array?
Aerodynamic Crocodile: For the array it's just the raw data, whereas the linked list also has the pointers. That's a huge distinction. But moving pointers is as good as moving data itself, because you don't really care what a pointer points to; all you're doing is adjusting the next pointer, so...
Intergalactic Avenger: Let's make it a little more concrete. Say you have a tiny cache, just 12 bytes, and you're dealing with a 32-bit machine using 32-bit pointers. How many elements of your array can you fit in there, versus how many elements of your linked list?
Aerodynamic Crocodile: How big did you say the cache was?
Intergalactic Avenger: 12 bytes.
Aerodynamic Crocodile: The cache was 12 bytes, okay.
Intergalactic Avenger: Okay so you have 12 bytes in your cache.
Aerodynamic Crocodile: Yeah.
Intergalactic Avenger: How many elements of the array can you fit in there, versus how many elements of your linked list?
Aerodynamic Crocodile: Since we're talking about integers, each element is a 32-bit int, so with a 12-byte cache on a 32-bit machine I should be able to fit three integers. Similarly, I would be able to fit three pointers, but three pointers is as good as basically one and a half elements, because each node would have a previous as well as a next pointer. So in terms of elements, I'd have three elements for the array but only about one and a half for the linked list, depending on what's in the struct. So I'd be more likely to keep missing the cache, I guess.
Intergalactic Avenger: Yeah. As you're going through the array, the space taken up is less than in the linked list case, because of all those pointers.
Aerodynamic Crocodile: Since it only takes 32 bits to represent an int.
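The arithmetic can be made concrete with `sizeof`; the doubly linked node below is an invented example, and the exact numbers depend on the platform's int and pointer sizes:

```c
#include <stddef.h>

/* A doubly linked node as discussed: the payload plus previous and
 * next pointers. On the hypothetical 32-bit machine, this struct is
 * 12 bytes, so one node fills the whole 12-byte cache where the raw
 * array would fit three ints. */
struct dnode {
    int value;
    struct dnode *prev;
    struct dnode *next;
};

/* Whole elements of a plain int array that fit in a cache. */
size_t array_elems_per_cache(size_t cache_bytes) {
    return cache_bytes / sizeof(int);
}

/* Whole linked-list nodes that fit in the same cache. */
size_t list_nodes_per_cache(size_t cache_bytes) {
    return cache_bytes / sizeof(struct dnode);
}
```

On a 64-bit machine the gap is even wider, since each pointer alone is 8 bytes.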
Intergalactic Avenger: Okay, excellent. So what this all brings us to, if we go back to the text pane, is what you said at the very beginning: that it's critical to allocate the memory the right way. You described a sort of naive way of allocating memory: for example, if you started with 1 5 7 9 2 5, then to insert you had to malloc a new array that was one bigger, add in the 8, and shift everything over. That's the most naive way of doing it, so let's explore it and see if we can improve on it a little. As you described it in these three steps on lines 11 to 13, the simplest, most naive way, how long is it going to take asymptotically to build up an array of size N? And let's make it a little easier and say we're always going to be inserting at the end.
Aerodynamic Crocodile: To fill up an array of size N, just filling at the end... are you assuming the memory is pre-allocated or not?
Intergalactic Avenger: No, we're going to use exactly the type of memory allocation scheme that you described here. So actually, in this case it would be more like this, and you need to end up with that at the end, so it would look like this. At the beginning of the step, you have a completely full array, you malloc a new one of size one greater, you copy everything over, and then you add the 8 to the end. Building up your array that way, how long is it going to take to fill up an array of size N?
Aerodynamic Crocodile: It's going to take basically the worst-case runtime of the malloc, multiplied by N, assuming the runtime of the malloc is 1, or some magical allocator.
Intergalactic Avenger: Let's say that it's constant.
Aerodynamic Crocodile: In that case it's going to be N, yeah.
Intergalactic Avenger: Allocating is constant, but I mean filling up a whole array of size N by calling insert multiple times. For example, to make the array we have there, it would be insert(0, 1), insert(1, 5), insert(2, 7), and so on.
Aerodynamic Crocodile: That's what I'm saying is O(N). If you want to include all of these, I don't know how many of these you want to give, essentially...
Intergalactic Avenger: Well, so it's more like... yeah, okay...
Aerodynamic Crocodile: So yeah, if you end up with m elements, it's going to be essentially m*n, where m is the number of elements you want to add and n is how long insert itself takes.
Intergalactic Avenger: Okay, so in our case, where we've described the process, what is that going to be? We have an idea of m: we're trying to build up a list of length m. So what is n going to be, as far as this algorithm is concerned?
Aerodynamic Crocodile: n is the memcpy that's required here; this part is basically nothing. The allocation is of course free for us. To get to this state, to do the memcpy into a new block... are we assuming the allocator is able to just extend the block, or...?
Intergalactic Avenger: No, you can't extend the block.
Aerodynamic Crocodile: Oh okay, yeah. If it's not that magical, the memcpy up to here would be n, basically.
Intergalactic Avenger: Okay.
Aerodynamic Crocodile: And then adding the element here: this step itself is O(1), and this step is O(N).
Intergalactic Avenger: Okay, all right, that sounds good. Now I'm just looking at this line on line 20. n is actually fixed at any one of these steps, right? So this is O(1), and this is O(2), and this is O(3), etc. Since we know how long the list is at each point, can you rephrase line 20 in terms of just m?
Aerodynamic Crocodile: Yes, uh, in terms of just m.
Intergalactic Avenger: Because m and n are related.
Aerodynamic Crocodile: Yeah, m is n, basically.
Intergalactic Avenger: Right.
Aerodynamic Crocodile: m equals n in this case, I guess, so O(N^2).
Intergalactic Avenger: Yes, okay, perfect. So we now have an O(N^2) algorithm for doing all these inserts, which certainly doesn't look like what we drew on the whiteboard, which was a line. So can you think of any way to get this timing down, to turn that polynomial curve into a line?
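For reference, the naive grow-by-one scheme under discussion might be sketched like this in C (the function name and the free-the-old-block convention are assumptions):

```c
#include <stdlib.h>
#include <string.h>

/* Naive append: every insert allocates a block exactly one element
 * larger and copies the whole array over, so each call is O(len),
 * and building an N-element array this way is O(N^2) in total. */
int *append_naive(int *arr, size_t len, int value) {
    int *bigger = malloc((len + 1) * sizeof *bigger);
    if (bigger == NULL)
        return NULL;
    if (len)
        memcpy(bigger, arr, len * sizeof *bigger);  /* O(len) copy every time */
    bigger[len] = value;                            /* O(1) write at the end */
    free(arr);
    return bigger;
}
```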
Aerodynamic Crocodile: Um, just reversing it...
Intergalactic Avenger: Well, we described a certain methodology for doing these inserts, and it involved allocating something new that was one greater, copying all the elements over, and then inserting.
Aerodynamic Crocodile: Exactly yeah right.
Intergalactic Avenger: So is there a different way? That's a simple way, and it will work, but it's a bit slow: n squared, polynomial time. Can we think of a way to reduce that, so we can guarantee a straight-line curve as opposed to a polynomial curve when these things get large?
Aerodynamic Crocodile: Sure, you optimize the insert, right?
Intergalactic Avenger: Right, yeah. So just think about the insert case, and just the insert at the end, so we don't have to worry about the other cases for now.
Aerodynamic Crocodile: Insert at the end... basically what you're looking at is a stack of some sort, or a queue of some sort. So how do you implement that as an array-backed list?
Intergalactic Avenger: Sort of... all I'm really thinking about is that steps 11 through 13 describe an algorithm for how to insert new elements, and as we've described it, it has O(N^2) runtime to build up a list of length N.
Aerodynamic Crocodile: I'm just curious, are you limiting yourself to an array-backed list, or...?
Intergalactic Avenger: Yeah, I'm just thinking about arrays in this case. Looking at the graphs, we like the one the array has, as opposed to the linked list; the only trick is that we have to be careful with how we do the memory allocation, as you pointed out. So can we get down to the specifics of what such a memory allocation scheme might look like, to guarantee that we're doing a better job?
Aerodynamic Crocodile: So we can look at some sort of exponential growth of the size of the memory that we allocate. For example, let's pretend we start with an empty array, and a one is being inserted. We have no spot for the one, so in the simple case we just allocate size one. Then you're inserting a five; let's assume that's also a simple allocation, so now we're at size two. Now, when we insert the seven, instead of allocating size three the naive way, we know that at some point something new is going to be added onto the end, so we can allocate size four instead. Then whenever you insert the nine next, you already have space for it, so you can just insert it, and that's as good as O(1), since there's no allocation or memcpy. You keep it at size four. Eventually you insert whatever number is next, say a two: you'd copy 1 5 7 9 over, but you'd increase the capacity to size eight, so the next three insertions are going to be O(1).
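That doubling scheme can be sketched as a small growable array in C; the struct and function names are illustrative, and malloc-plus-memcpy is used instead of realloc to mirror the discussion:

```c
#include <stdlib.h>
#include <string.h>

/* A growable array whose capacity doubles whenever it runs out of
 * space (1, 2, 4, 8, ...), so most pushes are a single O(1) write
 * and only the power-of-two pushes pay for an allocation plus a
 * full copy. */
struct vec {
    int *data;
    size_t len;
    size_t cap;
};

/* Returns 0 on success, -1 on allocation failure. */
int vec_push(struct vec *v, int value) {
    if (v->len == v->cap) {
        size_t newcap = v->cap ? v->cap * 2 : 1;
        int *bigger = malloc(newcap * sizeof *bigger);
        if (bigger == NULL)
            return -1;
        if (v->len)
            memcpy(bigger, v->data, v->len * sizeof *bigger);
        free(v->data);
        v->data = bigger;
        v->cap = newcap;
    }
    v->data[v->len++] = value;   /* the cheap, common case */
    return 0;
}
```

Pushing 1, 5, 7, 9, 2 leaves the length at 5 and the capacity at 8, so the next three pushes are O(1), exactly as described.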
Intergalactic Avenger: Okay, perfect, I like that very much. Now can you tell me, in asymptotic notation, what the runtime is to fill up an array of size N by doing successive inserts at the tail like this?
Aerodynamic Crocodile: Right. The only time I'm allocating and memcpying is at every power of two, so those points are effectively log(n) apart, I guess? How do you describe that?
Intergalactic Avenger: Another way to think about it: in getting up to size eight, you had to allocate how many different times?
Aerodynamic Crocodile: Oh, I would have had to allocate... one, two, three times. Three times. Yeah, that is a good way to think about it. So for size n, I would have had to allocate log2(n) times.
Intergalactic Avenger: Okay, and each time you allocated...
Aerodynamic Crocodile: I would have a hit of n, so I have a penalty of n every log n inserts... I don't know how to think about that. I might be getting into probability, but yes, exactly.
Intergalactic Avenger: Sometimes it's helpful to just count it up. If you're building up a list of size 8, how many operations is that going to take?
Aerodynamic Crocodile: Uh it is a...
Intergalactic Avenger: You can just count it up using these examples, right?
Aerodynamic Crocodile: Right. So for size eight... okay, so this took one operation, the allocation and copy. This took two operations, so in total we're at three. Then this took... three operations, really? Three operations, plus the three from earlier, so I guess this is six... this one was basically zero operations, but to come up to this point I had six. This one I had to do four operations for, so 4 plus the previous number is 10. What's the pattern here?
Intergalactic Avenger: Yeah... this is a funny way of thinking of it.
Aerodynamic Crocodile: Perhaps the wrong way; I guess that's why I'm stuck.
Intergalactic Avenger: Right, so let's see. Think of it as two columns: the incremental steps and the total steps. There is something a little confusing here, which is the allocation. The first one is fine, it's one step. For the second one, the allocation is free, the copy is one, and then you added one.
Aerodynamic Crocodile: Yeah, so this should be... yeah, that allocation was free, the memcpy was one, oh, so I guess...
Intergalactic Avenger: Right, so that's fine. For the next one, the allocation was free, the copy was two, and you added one. So we're still doing okay here, and the total is six. Then for this next step, there's actually only one, right?
Aerodynamic Crocodile: Uh yeah.
Intergalactic Avenger: Because you've already pre-allocated the memory.
Aerodynamic Crocodile: Right, right, so that takes us to 7.
Intergalactic Avenger: So at this stage, again, the allocation is free, the copy took 4, because you're copying the top 4, and then you add one.
Aerodynamic Crocodile: That takes us down to...
Intergalactic Avenger: So that's 7, and this is 12... and then 1 5 7 9 2.
Aerodynamic Crocodile: It's going to be 13, basically; after the allocation, it's always just the insertion.
Intergalactic Avenger: So then this is just one and...
Aerodynamic Crocodile: It's going to be one for a while, until we exhaust the allocation.
Intergalactic Avenger: Right, yeah. Let's see, are we up to 8 yet? So now the next one, and so on, is going to be 8 plus 1. So now maybe you can start to see the pattern a little better.
Aerodynamic Crocodile: Oh yeah, so I guess the key is to focus on this.
Intergalactic Avenger: So actually this was a little bit weird. Okay, that kind of starts to... I just changed line 22 to show the pattern under the variable a bit more.
Aerodynamic Crocodile: So the basic form this seems to follow is something like the floor...
Intergalactic Avenger: I guess the other thing we could do here is look at the size of the array; sometimes that can help too, because this number here on the left is ultimately what we're trying to correlate against. One way to visualize it, now that we've enumerated everything, is as two separate columns: one is the allocation, and the other is inserting the number.
Aerodynamic Crocodile: Right, the allocation itself intuitively feels like it should be a log, because at the end of the day we only end up allocating whenever we hit a power of two, and that happens log n times across n inserts.
Intergalactic Avenger: You're right that it only happens log n times. But the question is: when you add them all up, does it equal n log n, or does it equal something else?
Aerodynamic Crocodile: It shouldn't be n log n. It looks like it should be log n... plus the distance from the previous point, so for example... so this is two... I'm totally blanking out here.
Intergalactic Avenger: Let me give you a little hint. I can tell you're on the right track, and you've got the logs in there.
Aerodynamic Crocodile: Yeah the secondary part...
Intergalactic Avenger: Think in terms of the time of allocation plus the time of insertion. That's sort of how I was showing these numbers here.
Aerodynamic Crocodile: So it's definitely O(log N), that much I feel comfortable with. The part I'm not sure about is... maybe not...
Intergalactic Avenger: The time of insertion is just all of these ones: we're always inserting at the very end, the allocation is already done, and we just insert. So to get up to n elements, we just added n ones, right? To get up to this number eight here, we had eight ones in that time-of-insertion column. So the big question is: what is this other thing? If we were trying to get up to n, what would we be allocating? First we allocate a 1, then we allocate a 2, then a 4, then an 8, and eventually n. Though really this is not so much the allocation as the time of the copy.
Aerodynamic Crocodile: Mem copy right.
Intergalactic Avenger: Time to copy. So at the first one, we 're just copying one, then two, then four, then eight, and then at the very end you 're only allocating, or you 're only copying at most, and are actually your copy is more like n/2 because you never copy, so like when you get so to build up that size of 8, the last copy you did was four. So when you look at this 8 here, to build up the length of the list of length 8, the last time you did a copy was for 4. So, what is this whole thing. So that 's the time of the copy, and then the time of insertion is just here to line everything up.
Aerodynamic Crocodile: So what is the left part that 's the question. It looks quadratic-ish.
Intergalactic Avenger: Well... I mean pick let 's pick a number right? Like let 's say we 're going to try to go to 10... to 256 or something all right. So 8 16 32 64 128...
Aerodynamic Crocodile: So that is something like a bunch of those... are powers of 2 basically but you 're adding up a bunch of powers of twos... what is that?
Intergalactic Avenger: Well what is 1 plus 2 in relation to the number 4?
Aerodynamic Crocodile: Oh, it is essentially n minus 1... 1 plus 2 is 3, which is 1 less than 4... then 3 plus 4 is 7, which is again 1 less than 8... so that pattern is going to follow basically all the way to n, or 128.
Intergalactic Avenger: Yeah, so if we sum this all the way up through n/2, what does that equal? Well, what does it equal in terms of 256? So when we were trying to build up an array...
Aerodynamic Crocodile: It would be n times n minus 1 basically, wouldn 't it? So let me limit myself to eight so that I can deal with small numbers. So okay, the copies add up to n minus 1... oh, and I guess plus the time of the insertions that we still have floating around, which is n... so that boils down to basically 2n minus 1 I guess.
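The tally worked out above — copies of 1, 2, 4, ... up to n/2, plus n single insertions, totaling 2n minus 1 — can be checked with a short sketch. This is just an illustrative counter (Python, not anything from the conversation), assuming n is a power of two:

```python
def doubling_cost(n):
    """Count element copies and insertions to append n items
    to an array that doubles its capacity when full."""
    capacity, size = 1, 0
    copies = insertions = 0
    for _ in range(n):
        if size == capacity:
            copies += size      # copy existing elements into the new block
            capacity *= 2       # double the capacity
        insertions += 1         # write the new element at the end
        size += 1
    return copies, insertions

copies, insertions = doubling_cost(8)
# copies = 1 + 2 + 4 = 7 (= n - 1), insertions = 8 (= n), total = 2n - 1 = 15
```

For n = 8 this reproduces exactly the table from the discussion: copies of 1, 2, and 4 summing to n minus 1, plus n insertions.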
Intergalactic Avenger: Yeah exactly. So...
Aerodynamic Crocodile: That is a really nice way of thinking about it I guess. It 's rather intuitive when you do it like that I guess.
Intergalactic Avenger: Yeah, it is. It 's funny, right? So just to tie all the pieces together, when we did it the other way, it was how much? When you did the naive way, how long did it take to insert everything?
Aerodynamic Crocodile: The naive way was basically copying everything over each time, and yeah, I guess it was O(N^2) time? Whereas here it is... are we including all of these inserts? We are, right.
Intergalactic Avenger: Yeah, well, we 'd already worked it out from before. Okay, and the new way we have 2n minus 1. Just so we are totally clear here, which one 's better? I mean we already worked this out, I 'm just trying to tie everything together in your mind. So the old way, I think we 'd even written it, and then the new way... I guess we could just call these N 's. And which one 's better between these two?
Aerodynamic Crocodile: One is quadratic the other is linear I guess, so the linear, O(2N - 1) would be considerably better.
Intergalactic Avenger: Okay, very good. And just to bring it back to what I was implying with this thing on the whiteboard: it is in fact going to be linear. You had even drawn this little spike, but actually you don 't even need to worry about the spike. If you are smart about how you do the allocations and the copies, then you avoid the spikes altogether, and you just get linear time, no matter how big the list grows.
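The quadratic-versus-linear comparison just made can be sketched side by side. Both functions below are illustrative counters (not from the conversation) that tally total work — element copies plus insertions — under the two growth policies:

```python
def total_cost_grow_by_one(n):
    """Reallocate to size + 1 on every append:
    copies 0 + 1 + 2 + ... + (n - 1), which is quadratic."""
    copies = sum(range(n))          # total copy work across all appends
    return copies + n               # plus the n insertions themselves

def total_cost_doubling(n):
    """Double the capacity when full:
    copies 1 + 2 + 4 + ... + n/2 = n - 1, which is linear."""
    capacity, copies = 1, 0
    for size in range(n):
        if size == capacity:
            copies += size          # copy into the doubled block
            capacity *= 2
    return copies + n

# For n = 8: grow-by-one does 28 + 8 = 36 steps, doubling does 7 + 8 = 15.
```

The gap widens quickly: grow-by-one is n(n - 1)/2 + n total steps, while doubling stays at 2n - 1 for power-of-two n.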
Aerodynamic Crocodile: Wow yeah, that 's revealing to me.
Intergalactic Avenger: Yeah no it 's funny, like we all take for granted like how the inner workings of some of these very simple data structures work, but yeah sometimes the math underneath it can be kind of...
Aerodynamic Crocodile: Yeah you see a loop and you just need to say hey yeah that 's O(N). If you see two, that 's O(N^2).
Intergalactic Avenger: But actually, this is a famous one in terms of series. It definitely sounds like you have a good gut sense for asymptotic notation and series and logs and all that kind of stuff, but sometimes it can help to memorize some of these formulas, and this is a famous one: when you are counting up powers of two, so two to the zeroth plus two to the first plus two to the second... up to two to the log n, which is just a weird way of saying n, that sum equals 2n minus 1, which is O(n). So that 's an interesting one to keep in mind: even though you 're adding up powers and the powers are growing very quickly, everything to the left of the last term is completely overshadowed by that last term.
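The geometric series identity being described — powers of two summing to just under twice the final term — can be verified numerically with a quick sketch (Python, purely illustrative):

```python
# Geometric series: 2^0 + 2^1 + ... + 2^k = 2^(k+1) - 1.
# Writing n = 2^k for the last term, the whole sum is 2n - 1:
# the final term alone dominates everything before it.
for k in range(1, 20):
    n = 2 ** k
    assert sum(2 ** i for i in range(k + 1)) == 2 * n - 1
```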
Aerodynamic Crocodile: Right, yeah, that makes sense.
Intergalactic Avenger: So that 's sort of an interesting property that comes out and that 's the reason why certain algorithms have very nice properties that they do, because if you can structure an algorithm so that it has this kind of pattern, then you know that basically you can just focus on this last term and how much time it takes, and you can completely ignore everything that happened previously. So that 's just sort of a little tip for this and this is called a geometric series.
Aerodynamic Crocodile: That rings a bell. The one that I remember is basically Gauss 's formula, for when you add up 1 all the way to n. That was the pattern I had memorized, so that 's what I was trying to reconstruct here.
Intergalactic Avenger: Ah right, so there 's another one, and this one is quite famous too. So adding up 1 all the way up to n...
Aerodynamic Crocodile: Yeah, this is n times n plus 1 divided by 2.
Intergalactic Avenger: Awesome, yeah, n times n plus 1 over 2, which is O(n^2), which in some ways seems almost counterintuitive: when you just add numbers in an arithmetic series like this, it has a greater sum than the previous one that was increasing by powers of 2.
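Gauss 's formula can likewise be checked numerically. Note the exact closed form for 1 + 2 + ... + n is n(n + 1)/2 (again Python, purely illustrative):

```python
# Gauss's formula: 1 + 2 + ... + n = n(n + 1) / 2, which grows as O(n^2) --
# larger asymptotically than the geometric series 1 + 2 + 4 + ... + n = 2n - 1,
# even though the individual terms here grow much more slowly.
for n in range(1, 100):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
```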
Aerodynamic Crocodile: Yeah exactly.
Intergalactic Avenger: The whole reasoning behind it is that if you look at how many terms there are, left to right, there are only log N terms in the top one, but in the bottom one there are n. So it 's about how many terms you are actually adding up.
Aerodynamic Crocodile: So essentially the number of terms up there becomes a log n, and even though the terms themselves grow quickly, the whole sum essentially just boils down to an n.
Intergalactic Avenger: Yeah exactly, so yeah these are just a couple useful series to kind of just have in your mind when when you 're going over this stuff. Cool. Alright, excellent well done! Thanks for sticking it out.
Aerodynamic Crocodile: Uh yeah, well thank you very much. Actually it was a lot of fun, and it showed me what I need to focus on. This has been incredibly insightful. I tend to be an embedded programmer, so I don 't care about most of the sciency computer science stuff. Most of the stuff I deal with is O(1), so I don 't need to care. So yeah, I tend to just forget the computer science stuff and focus on actual real work.
Intergalactic Avenger: Yeah, I know. It is kind of funny, right? I work in search, so I definitely deal with stuff that works on larger algorithmic sets, but still, you deal with this stuff very seldom. Actually implementing a list is not what people do day to day, but for better or worse, that 's what a lot of interview questions are in this world, so it 's kind of harkening back to those CS101 classes. It 's kind of funny.
Aerodynamic Crocodile: If we talked a lot about caches and stuff, that I would be able to handle, because that 's the stuff I deal with. I care about when I hit a cache miss, or what takes up latency, and things like that.
Intergalactic Avenger: Yeah exactly, cool all right, well good luck with everything.
Aerodynamic Crocodile: Thank you, I appreciate it! Bye.
Intergalactic Avenger: Bye.