2017-08-22
Regular expressions patterns are implemented as finite state machines! I find this pretty cool, and finding this out has helped me to understand both things a bit better.
This means that we can:
A finite state machine is just a model for understanding something. It models something that has different states, and that can only be in one of those states at one time.
For example, a door can either be locked or not locked:

We can add the conditions that cause the state to change:

A finite state machine is defined by:
If I wanted to search a string for substrings that:
ab's in the middlecI could search for the regex pattern ab+c (where + means the previous character can occur one or more times).
So:
abc, abbbc, and abbbbbbbbc would all match,ac and acb wouldn't match.We can draw this as a finite state machine:

Notes:
a causes the state to remain 0)If we want to find out if a string (say: aabc) contains our pattern, we will iterate through the string and keep track of our possible states. We'll start out with only one possible states: state 0. At each character in the string, we will update all possible states with the new input, removing states that do not match
We will iterate through each character like this:
| Current possible States | Input character | Update possible states |
|---|---|---|
| 0 | `a` | 1 |
| 0, 1 | `a` | 1 |
| 0, 1 | `b` | 0, 2 |
| 0, 2 | `c` | 0, 3 |
State 3 is one of our current states, so we know that the string contains our pattern!
If we wanted to find the matching substrings (ie. return the index of each match), we'd have to also keep track of those too.