Paisley Wallaby: Hello can you hear me?
Fresh Albatross: Hi, yes I can.
Paisley Wallaby: Great how are you doing?
Fresh Albatross: I'm good how are you doing?
Paisley Wallaby: I'm good, what language would you like to use?
Paisley Wallaby: Okay. Are you familiar with regular expression matching?
Fresh Albatross: Not really.
Paisley Wallaby: Okay. No problem. So the question is: write a function that takes two strings as arguments s and p and return a boolean denoting whether s matches p. So s is the string to match and p is the pattern. So p is a sequence of any number of the following: so you can have a lowercase letter which stands for itself, you can have a dot which matches any character, or you can have a star which matches zero or more occurrences of the previous single character. So the star always only applies to the previous character, there's no parenthesis or anything in this subset of regular expressions. So some examples….so in the first example the pattern is just 'ab' so there's only one string that would match this and s is not that string so it's false. In the second example the pattern is a* so s is just 'a's so that matches. In the third example the pattern is any string so s matches. And here the pattern is a single character so it would match any one character and s has length 2. Here the pattern is any number of 'c's followed by any number of 'a's followed by a one b so s does match because it has 0 'c's and 2 'a's. And finally this last example: the pattern is any number of 'a's followed by any one character and s does match. A couple notes: don't worry about performance at first just come up with the simplest most naive solution you can think of. I encourage you to write down your algorithm in pseudocode or in words before you start because that just helps you separate the problem solving from the details of how to write the code and people who do that tend to do better. As much as possible say what you're thinking while you're working on the code or on the algorithm because this is just as much about communicating how you solve problems as it is about how you can code. And also I encourage you to make up more test cases because this is just a small number of them.
Fresh Albatross: Yeah totally. Could I delete everything about line 31?
Paisley Wallaby: Go ahead and delete that, that's just boilerplate.
Fresh Albatross: And I just want to clarify: line 18 it works because there are 0 occurrences of 'c' any number of occurrences of a and one occurence of 'b' and that's why it is true.
Paisley Wallaby: Yep, that's right.
Fresh Albatross: Okay the first thing I'm going to think about it my inputs. Like you said they are going to be two strings. s is going to be the string that's basically….p is the pattern and currently I see s as what you're running against that pattern. So I don't know what else to call it I guess I'll just call it string. And the output is a boolean of whether it matches. Constraints and edge cases. The first edge case I can think of is: can I always assume that I'll have a valid string and valid pattern.
Paisley Wallaby: Yeah you can assume that the pattern will be valid and non-empty.
Fresh Albatross: Okay sounds good. Let's see. Okay currently the first thing that I'm thinking about is: well you provided three different conditions. So basically the one condition that I'm thinking about is from 'a' to 'z' and if it is a regular 'a' to 'z' then it's basically a direct match and that's fairly straightforward. And then with a dot it could be any character so I can consider this basically as a wildcard. I think this one might be really simple too. And the biggest challenge would probably be the star because it matches 0 or more occurrences of the previous character. I think what might be challenging with the star would be: whenever I see a star, I have to be like what's the previous character in the pattern string. Basically what I need to do is not try and direct match the previous character. And basically the other thing is you could have 0 of that character or more. Let me think about that. Right now I'm thinking I would have to find….line 14, right now I'm thinking about that pattern. I see it matches any character and then the star. So if you see a dot and a star, it could match literally anything. Because if ab passes then it's literally any number of occurrences of anything. So it's kind of like an edge case if you get this specific pattern: anything can match. And I'm thinking well, if I get this pattern between 'aa', 'bb' then it would have to be 'aa' then whatever in the middle, then 'bb' and that would be true?
Paisley Wallaby: Yeah
Fresh Albatross: Okay cool. So if the 'aa' and the 'bb' were not there then this would be false.
Paisley Wallaby: Right because it requires you to have two 'a's at the beginning and two 'b's at the end.
Fresh Albatross: Awesome okay cool. Alright so right now the biggest challenge is dealing with the star. Basically I think one thing that needs to happen for sure is the p string has to be split and kind of segmented out based on where you see dots and where you see starts. So what should I call my function? Should I call it like match()?
Paisley Wallaby: Sure.
Fresh Albatross: Maybe I'll call it isAMatch()...isMatch(). And like you said going to try...I'm going to put one example down here for now, not going to invoke it just yet because I have nothing in my function. 'c*a*b' Now I'm trying to think where should I split my, or how do I parse out my pattern string. So one thing I'm thinking of is using split and making an array of different matches. So for example wherever I see a star. So wherever there is a star, pull out the previous character. And I think you can just .split() to do that.
Paisley Wallaby: And you're welcome to look up documentation for library functions.
Fresh Albatross: Okay sounds good. Is it okay if I look up split() for now?
Paisley Wallaby: Yeah.
Fresh Albatross: Awesome cool. Split the string into an array of strings by separating the string by a given substring. It takes a separator. However, if I put a separator as a star it's going to get rid of the star altogether. So for example if I put in split('*'), it's going to give me ['c', 'a', 'b'] but it doesn't necessarily. Well let me think about this let me try….let me think about 'a' 'b'. Is it okay if I test this out?
Paisley Wallaby: Yeah.
Fresh Albatross: I'll try doing a split. Okay so it splits it up. So wherever there's a split I should take the previous value and apply the star to it. However the only issue I see is what if….I'm going to try something else. What if I go 'ac*ab*bb' so right now I expect the 'ac' 'ab' and 'bb' split but the problem is...so it splits them into 3 elements but the ones that the star applies to are the first 'c' the first 'b' and that's it. Okay so wherever a split() happens the last character of the previous string is the one I have to apply star to. Okay: when splitting string apply star to the last character of all elements except the last element. Because there's a split on the last one just because its the last one but that doesn't mean anything happens to it. And the surest example I have is 'a*b' and 'b' doesn't have a star after it.
Paisley Wallaby: But what if you had something like: 'ac*ab*bb*'?
Fresh Albatross: Oh I think that might work because what that ends up doing is it creates an empty element at the last.
Paisley Wallaby: Oh I see.
Fresh Albatross: Yeah you see like there: you see there's an empty string. So that would actually work.
Paisley Wallaby: Okay.
Fresh Albatross: So I think that works. Let's see: so I am going to say first step is basically split the pattern by '*' I guess? Split pattern by star. For every element in the split array, I would iterate over every character. If the character is a match….wait actually: if the character is the last one run the star matching pattern otherwise it's basically a direct match or wildcard. So what that means is if 'a' to 'z' check for direct match. If wild card then just accept any value. Actually I'll just leave it as a conditional just so that I have a visual idea of what's going on. So I'm going to try that, just walk through my pseudocode with the 'ac*ab*bb' example?
Paisley Wallaby: Sure.
Fresh Albatross: So I'll need to think of a sample input string as well. So let's say my input string is going to be: 'acccabbbbbb' so this should take. So 'a' would direct match so if I split the pattern by star then I'm going to get ['ac', 'ab', 'bb'] let me know if you'd like me to change my strategy in any way I'm just walking through it right now.
Paisley Wallaby: No, I'm with you so far.
Fresh Albatross: So for every element in the split array, so basically iterating over every element here, I'm going to iterate over every character so since the character isn't the last one for the first 'a'. Oh I just realized that I'll probably need some kind of stringIndex I'll initialize it to 0 and this will represent the original variable s's index. So let's say it's an 'a', it's a direct match from 'a' to 'z' it's going to check for a direct match and if it is you need to just keep going and also you need to increment the string index by one. If it's a wildcard it's the same way. You increment stringIndex by one and the else cases should be....then you can consider the fail case which is just return false immediately. Okay yep so it's going to increment stringIndex by 1 then it's going to go to 'c' and because it is the star pattern it's going to say: it's a character 'a' to 'z' what you need to do then is say if direct match….the first thing you see is if it is a direct match and if it's not you just return false. But if the first letter is a direct match, while it's still a direct match, you would just keep incrementing the stringIndex. So for example if I have 'ccc' it should just go 'c' 'c' 'c' and then match it like that. The only issue I have right now is: what if I had 'c*c' and wondering if that works. Actually it would work because it could count as 0 'c's and that could work. Oh now that I mentioned it if 0 works it wouldn't return false it would just take whatever and it wouldn't return false. Does that make sense?
Paisley Wallaby: Yeah.
Fresh Albatross: So yeah, if it's 'c*' and there are no 'c's it hops over that case. Oh sorry it doesn't hop over, I'm going back to the original example. So it goes to all 'c's so now my current index is going to be 4 and then it checks for 'ab' and if it finds the 'a' it'll be fine with that. It's going to go 'b' and that's the wildcard again. Actually there is an issue: what if I count all the 'b's and then I go to my next string and I'm basically out of indices. So there needs to be some kind of splitting to check for different cases. Does that make sense?
Paisley Wallaby: I think so. You mean: so like looking at line 78 if you are matching the first 'b*' and you use up all the 'b's in the string and then you're at the empty string..yeah okay I see what you mean.
Fresh Albatross: That's interesting, let's see. But it needs to kind of look ahead as well which is not being accounted for. Trying to think of a way to cleverly look ahead. Nothing is coming to mind right now. If I'm at the biggest 'b' then I need to check the next 'b'. What I'm thinking of is: if the next 'b' is just a regular character without a star, then it needs to account for that fact. But if the next character is also another star character then it could just take 0 characters as well so I think there must be some kind of recursion or some kind of branching to see if there are different pathways you can take.
Paisley Wallaby: Yeah I think you're on the right track there and think about using recursion and basically what the different pathway is like what choice you're making at every step.
Fresh Albatross: Exactly. So something in my function between lines 57 and 58 needs to be updated. There needs to be some kind of recursion happening here. It's not very descriptive but I'm going to change that in a bit. So let's see. Well it's still a direct match. I'm thinking I just need to in my recursive function, needs to take in the remaining string I think. Maybe it could call itself like call isMatch() and instead of taking in….well actually no it could still take it in. It could take the remaining string and it could take...instead of p because now it's a split pattern. It's not a string anymore it's just split by whatever it is and I need to grab the remainder which is going to be challenging I feel like.
Paisley Wallaby: The remainder of the pattern you mean?
Fresh Albatross: Right, the remainder of the pattern. So I need to somehow grab the remainder of the pattern somehow. So right now let's assume I'm at 'c', how would I know where the remaining pattern is. Well actually I could figure that out. The remainder of the pattern would kind of be like...cause I'm still iterating over every element based on line 51 so I could just grab and basically join up remaining elements after and also each character after the starting character. Well, if it's a start case it wouldn't have anything so I'll just join up the remaining elements that come after the current element I'm in. So basically do an array join of elements after the current element. Okay that wasn't as bad as I thought it would be. remainingString should be relatively straightforward because we have the stringIndex so you would run isMatch() and basically store it as a value call it like foundMatch or something and if in any case. If foundMatch then you would return true. That means in all my recursive cases at least one of them have past already. It would be something like this I believe. I also need to…and this for if it's an 'a' to 'z' character. I then need to account for star characters as well. Okay so I have the 'a' to 'z'. If dot...so this one is interesting it doesn't have to look for a match it literally just calls the recursive function again and again and again. Basically whatever was in here, just paste it in here kind of. Because like I mentioned it just matches anything and it could be 0 or anything it could be 99 of anything. And I think that should work but I'm going to keep walking through my example up here.
Paisley Wallaby: Yeah sure.
Fresh Albatross: So when I'm at 'b' the stringIndex is 4 basically what it's going to do at b is call a direct match it would call isMatch() on the remaining string and on the remainder of the pattern for every possible number of 'b' so like 1 'b', 2 'b', 3 'b', 4 'b' and then if it finds a match for any of these instances then it returns true to the original call. I also need a fallback case at the very end for just returning true if I've iterated over the entire string and I found no falses then I should just return true. Actually the first case would work too because everything that comes after 'b' is just 'b' and 'b*'. Theoretically nothing should go wrong even for this, like it would just end the recursive call on the first recursion. Does that make sense?
Paisley Wallaby: Yeah.
Fresh Albatross: Alright, I think this would be my solution. Do you have any suggestions or recommendation?
Paisley Wallaby: Let's see. I guess I'm not entirely clear what recursive calls you're making because you talked about trying each possible number of characters in the the recursive calls but I'm not seeing that represented here as a loop or anything. Yeah okay.
Fresh Albatross: So basically I have a while loop on line 53 and so when while the character...so if it's a regular character from 'a' to 'z' if it's a direct match while it's still a direct match so let's say we're looking at the 'b' wildcard basically anything that comes after is the previous example. Basically when I'm at this index I want to run isMatch() on the remaining string so it would be 'bbbbb' and basically the remainder of the pattern would be any of the elements that come after the current array element. So the first recursive call would be something like 'bbbbbb' and the remaining pattern would just be...it would be joined with stars in the middle. It would be 'bb*' like that. And then it would check for this case and recursively keep going and try.
Paisley Wallaby: Okay absolutely.
Fresh Albatross: Okay so should I move on to coding this...Or should I….
Paisley Wallaby: Yeah.
Fresh Albatross: Okay. For every element in my split array, so let's say let patternArr = p.split('*'); so for every element, let's call it i for now. This will iterate over. For every element in split array, I would pull out the subString which would be just. So is it okay if I test my function as I go? It's a really meaty function.
Paisley Wallaby: Yeah.
Fresh Albatross: Okay. Thank you. Let's try that: 'ac', 'ab', 'bb'. That's what I thought. Awesome. I'm also going to reset the console. So I'm going to iterate over every character, so that can't be a character. So char is going to be starting at 0 and it's going to the end of the substring and increment like so. If char is equal to subString.length - 1 which is basically the last character. Then I run to the star cases. Then I would run the matching pattern. How do I tell if it's a 'a' to 'z' character? Let me check if there's a library function for that. There is a helper function that contains regular expressions not sure if I should use that.
Paisley Wallaby: You're welcome to use it if not what's a lowercase letter.
Fresh Albatross: Okay only for lowercase letters you said?
Paisley Wallaby: Yeah.
Paisley Wallaby: Okay.
Fresh Albatross: If char is a direct match to s...so I need to also initialize my stringIndex. So if(char === s[stringIndex]) then while (char === s[stringIndex]) then let remainingPattern would just be patternArr...I need to grab a slice of the array from i + 1, that's going to be patternArr.slice(i + 1).join('*'). It never got to the remaining pattern, interesting. So right now I'm checking to see why it never hit that pattern or that case. And I'm also going to add another console.log() to see if I hit every case. And then...okay it does get there at the correct time so if(char === s[stringIndex]) so the char itself is initialized. Oh it's because I'm never getting there yet because it's not counting for the first case, like letter 'a' of my first case. Let me take that out and see if that works. Okay I think it worked but then it caused that problem. So I'm getting 'ab*bb*' okay it's doing what I want it to do but it's also causing an infinite loop. Basically what I would do is I would join it with everything and basically say let foundMatch = isMatch(remainingString, remainingPattern) So let's test this and it's probably going to throw infinte loop but let's see if this works. Okay yep it's doing what I think it's doing. So I'm going to pass remainingString in there and if I ever get a value that is a match I would just return true. Then stringIndex++ here. So this checks for if the character is a direct match and basically if it's not a direct match then it must be a wild character then it would go in here and what would end up happening is you wouldn't have to check for if it's a direct match this entire piece right here would just be replicated down here like that. Does that make sense?
Paisley Wallaby: Yeah that makes sense.
Fresh Albatross: Okay cool I think that should be fine. So I need to account for the cases where it's not the last index, which should be relatively straightforward. So again, basically if it's a direct match...if it's a lowercase char...you have to check for if the stringIndex and if it's not true return false otherwise do nothing. Otherwise it increments the stringIndex by one. Otherwise it accepts any value which should mean that increments. So if that's the case I can take out this else block and just pull this down. I think that's it. Although to be fair I'm quite nervous because there's a lot of code and I'm not sure if it will all work. Well it returns true so let me try adding something that might complain like a 'c' at the end. Okay it still returns true so there's something wrong with this. Okay would you like me to go ahead and debug this?
Paisley Wallaby: Yeah we have about fifteen minutes but let's try to debug it.
Fresh Albatross: Awesome let's see. So wherever it ends is basically where true is returning too early. Let's see where it ends. So it tests until it gets to 'bb*', so it gets to 'bb*' and it returns true somehow. So when s is 5 'b's and 1 'c' and p is 'b*' I'm going to check the else case. Let me actually reduce the size of my input it's getting a little difficult to see.
Paisley Wallaby: Yeah good idea.
Fresh Albatross: Let me try a simple case for 'abc' first. So this should return true. So let's try a simple case for failing 'abcd'. Okay so this is interesting so there's something in my lower half of code that's causing it to fail. So it fails at stringIndex === 1 which is odd to me. Well it's not p, it's patternArr[i] oh that's the entire thing it's not patternArr[i] it's character. So at d it fails, or it exits. Oh it's because it's the last element so what's happening is it's getting to the last element of my split array so for example it's just abc right now but since it's the very very last element it's basically treating it as a wildcard 'c' which shouldn't be. So it goes 'a', 'b' then gets to 'c' and it treats 'c' as a wildcard well that's one bug but it shouldn't evaluate to true because when it gets to 'd' it should still go to false. Okay there are two bugs that I've noticed. The first one is basically accounting for the last array element has to be treated a little more specially, and the second thing is basically what if there's more to the string outside of the match. So for example it doesn't get to check 'd ' because it gets to the end and says everything is fine. So one thing I could check for is basically right off the bat tackle...if there's an actual character remaining then it should return false otherwise it returns true. I think it would cause that to...yeah there we go. So that's fine but now I need to account for the last array element as well. It would be this check in line 55. So the first thing is basically you need to say is: hey is this the last part of your substring and also i should not be the last element of your patternArr and I think that should be good. So let's try the original case again let's try 'abcd' and let's try 'abcd*' and see what happens that should return true. Let's try 'abcde' that should be false. Let's try 'abcd*f*g*' with 'abcddddffffgggg' and that should be true. And let's add an 'h' at the end and that should fail. Oh let me try cases with the dots.
Paisley Wallaby: I think this might be a good place to stop since we only have a couple minutes left I know there's more test cases but you've done a good job so far. Any questions?
Fresh Albatross: Yeah what did you think I did well and what do you think I could improve on?
Paisley Wallaby: Yeah I think you did pretty much everything well. You did a good job talking through the problem initially, did a great job specifying your algorithm upfront most people don't do it that clearly. You have comments in your code which is great and I like how you walk through the pseudocode manually with an example. I don't have that many suggestions for things to improve since you did so well but one minor thing is that on line 62 where you do the recursive call you're sort of unparsing the pattern then parsing it again when you make the call. You could just have a helper function and keep it in its parsed form, but that's really minor. I guess the other thing is like you spent about half an hour before writing code and most people have the opposite problem: they start on the code too soon. But I think just in terms planning out the time you could have started that a little bit earlier but I think it's a bigger problem when people write code too fast so that's not a big deal either.
Fresh Albatross: Sounds good okay.
Paisley Wallaby: Yeah thanks that's really helpful actually.
Fresh Albatross: Great. Nice talking to you and enjoy the rest of your day.
Paisley Wallaby: See you. Thank you so much.
Fresh Albatross: Yeah no problem, bye
Paisley Wallaby: Bye