A State Machine Simulator
This is it: the last day of NaCreSoMo 2014 and the finale of the CS mini-tutorial. We learned how to build a lexer. We learned what a state machine is, and how they are used to match or classify text. And we built a parser, using our lexer, that understands a language that describes state machines.
Today we’re going to go all the way: actually simulating a state machine. You can also just skip to the fun part: playing with the completed simulator.
Where we left off last time, we had a list of states, and each state had a label (name or number) and a list of transitions. What does it take to build an actual state machine out of that?
class StateMachine
First, we need some way to build one.
def initialize(states)
We have a list of all the states, but that makes it hard to jump from one state to another. Let’s build a map instead: a structure where you can look up a state by its name.
@states = {}
states.each do |state|
We should first check that we don’t have any duplicate names.
raise "duplicate state #{state.name}" unless @states[state.name].nil?
@states[state.name] = state
And we should also be looking for the start state. Otherwise, how will we know where to start?
if state.start?
raise "multiple start states: #{@start_state.name} and #{state.name}" unless @start_state.nil?
@start_state = state
end
On the flip side, if none of the states are start states, we also have a problem.
end
# We should have exactly one start state.
raise "missing start state" if @start_state.nil?
Now that we’ve built up a map with every state in it, we can check all the transitions to make sure they’re valid.
states.each do |state|
state.transitions.each do |transition|
destination = transition.destination_name
next if destination.nil? # A nil name represents an error transition.
raise "destination state #{destination} doesn't exist" if @states[destination].nil?
end
end
Okay, we’ve built a valid state machine! Note that we didn’t check for any final states: it’s valid to have a state machine with no final states (like the stop light machine).
end
Now that we have a state machine, we need to know how to run it. That’s a bit easier: we just go until we run out of input.
def match(input)
@current_state = @start_state
input = transition(input) until input.empty?
return final?
end
This is one of the great things about programming: you can write parts of the program assuming the rest of the program already works. We don’t actually know how to transition yet, but we can still write this match part. (As another example, the state machine uses the output of the parser, but it doesn’t have to know anything about the lexer.)
Actually writing the transition part isn’t that much harder:
def transition(input)
We start off by finding the first transition that matches the input. We do this in order because we want to be able to match some transitions (like "@") before others (like any symbol).
remaining_input = nil
transition_to_take = @current_state.transitions.find do |transition|
remaining_input = transition.match(input)
!remaining_input.nil? # If the match attempt didn't return "nil", it succeeded!
end
If we didn’t match anything, we should go to the error state, which we’ll represent with nil.
if transition_to_take.nil?
@current_state = nil
But otherwise we should look up the transition’s destination in the map of all states.
else
@current_state = @states[transition_to_take.destination_name]
end
If we end up in an error state, we should stop processing the input, so let’s pretend we’ve reached the end:
return "" if @current_state.nil?
Otherwise, the transition will have told us what input is left. We can send that back to continue being matched.
return remaining_input
And we’re done!
end
Here’s the entire StateMachine class, along with the new helpers we need to match plain-transitions and any-transitions.
class StateMachine
def initialize(states, whitespace: true)
@whitespace_sensitive = whitespace
# Use a mapping from state names to State objects,
# rather than just a list.
@states = {}
# Add all the states to the map.
states.each do |state|
raise "duplicate state #{state.name}" unless @states[state.name].nil?
@states[state.name] = state
if state.start?
raise "multiple start states: #{@start_state.name} and #{state.name}" unless @start_state.nil?
@start_state = state
end
end
# We should have exactly one start state.
raise "missing start state" if @start_state.nil?
# Check all the transitions.
states.each do |state|
state.transitions.each do |transition|
destination = transition.destination_name
next if destination.nil? # A nil name represents an error transition.
raise "destination state #{destination} doesn't exist" if @states[destination].nil?
end
end
end
def match(input, verbose: false)
@current_state = @start_state
until input.empty?
input = input.lstrip unless @whitespace_sensitive
puts "\tstate #{self.current_state.name}, input '#{input}'" if verbose
input = transition(input)
end
return final?
end
def transition(input)
# Find the first transition that matches the input.
remaining_input = nil
transition_to_take = @current_state.transitions.find do |transition|
remaining_input = transition.match(input)
!remaining_input.nil? # If the match attempt didn't return "nil", it succeeded!
end
if transition_to_take.nil?
@current_state = nil
else
@current_state = @states[transition_to_take.destination_name]
end
# Stop processing input if we're now in an error state.
return "" if @current_state.nil?
return remaining_input
end
def current_state
@current_state
end
def error?
@current_state.nil?
end
def final?
!error? && @current_state.final?
end
end
class PlainTransition
# A plain transition matches input if the input starts with the text.
def match(input)
# Index 0 is the first letter, which we know is a quote.
# Index -1 is the last letter, which should also be a quote.
# So we test letters 1 through -2...
if input.start_with?(@quoted_text[1..-2])
# ...and chop off the first <length> letters from the input if there's a match.
return input[(@quoted_text.length-2)..-1]
end
# Otherwise, return the special "nil" value for "no match".
return nil
end
end
class AnyTransition
def match(input)
# Chop off the first letter if we get a match.
case @kind
when "letter"
return input[1..-1] if letter?(input[0])
when "number"
return input[1..-1] if digit?(input[0])
when "space"
return input[1..-1] if space?(input[0])
when "symbol"
return input[1..-1] unless letter?(input[0]) || digit?(input[0]) || space?(input[0])
end
return nil
end
end
Whew! You can view the complete working source on Github.
That is it. We’ve built a language for state machines, and we can now turn a description of a state machine into a simulation.
Want to try it?
Remember:
- You can have more than one final state.
- State labels can be quoted text too (
"name"), not just numbers.
This version is built in CoffeeScript rather than Ruby. I condensed it a bit and used some more advanced features. You can find it on Github as well.
So that’s it! I’m…pretty sure I lost all of my readers in this tutorial. I’m definitely sure the best sections are the diagram-crazy, non-coding ones on state machines and regular expressions. But it was still fun getting here.
Extra credit: add a “description” production to the parser and turn the state machine into an adventure game.