\documentclass[12pt, a4paper]{article}

\usepackage{a4wide}
\usepackage{amsmath,amssymb, amsthm}
\usepackage{listings}
\usepackage{parskip}
\usepackage{pgf}
\usepackage{subfig}
\usepackage{tikz}
\usetikzlibrary{arrows,automata}

\lstdefinestyle{haskell}
    {language=Haskell,
      columns=fullflexible,
      breaklines=true,
      frame=single
    }
\lstnewenvironment{haskell}{\lstset{style=haskell}}{}

\newcommand{\defn}{\textbf}
\newcommand{\secref}[1]{$\S$\ref{#1}}

\newcommand{\comprehend}{\;|\;}
\newcommand{\powerset}[1]{\mathcal{P}(#1)}
\newcommand{\suchthat}{\;\bullet\;}

\newcommand{\states}{S}
\newcommand{\state}{p}
\newcommand{\stateAlt}{q}
\newcommand{\stateAltAlt}{r}

\newcommand{\alphabet}{\Sigma}
\newcommand{\letter}{a}

\newcommand{\strings}{\Sigma^*}
\newcommand{\emptystring}{\lambda}
\newcommand{\word}{w}
\newcommand{\wordAlt}{x}
\newcommand{\wordAltAlt}{y}

\newcommand{\transition}{\curvearrowright}
\newcommand{\transitionset}[2]{#1_{\stackrel{\transition}{#2}}}
\newcommand{\goesto}[3]{#1\!\stackrel{#2}{\transition}\!#3}

\newcommand{\stringtransition}{\twoheadrightarrow}
\newcommand{\stringgoesto}[3]{#1\!\stackrel{#2}{\stringtransition}\!#3}

\newcommand{\destinationset}[2]{#1_{\action{#2}}}
\newcommand{\destinationequiv}[1]{\sim_{#1}}

\newcommand{\action}[1]{\overrightarrow{#1}}
\newcommand{\actionequiv}{\sim}

\title{\texttt{fsmActions} -- a Haskell library for\\finite state
  machines \& FSM actions}

\author{Andy Gimblett\\\texttt{haskell@gimbo.org.uk}}

\date{version 0.4.0 --- October 21, 2009}

\begin{document}

\maketitle


\emph{Please note that \secref{sec:implementation} of this document is
  falling somewhat behind reality; however, the library's Haddock
  documentation is up to date and fairly complete -- and the rest of
  this document is accurate and relevant.}


\section{Introduction}

This is a library for representing and manipulating \emph{finite state
  machines} (FSMs) in Haskell, with an emphasis on computing the
effects of sequences of transitions across entire machines (which we
call \emph{actions}), and in particular investigating \emph{action
  equivalences} between such sequences.

The motivation for writing this library is investigating models of
user interfaces; in this context, states are implicit, transitions
correspond to UI events (e.g. button presses), and sequences of
transitions correspond to sequences of user actions.  We're interested
in comparing actions, which are the effects of sequences of
transitions across the whole device (for example, noticing when some
action is in fact an \emph{undo}); for that we need a representation
geared towards such comparisons -- hence this library, and its
idiosyncratic view of what's important about FSMs, which differs
somewhat from alternative definitions the reader may be familiar with.
E.g. we are unconcerned with start/accepting states (as found in
finite automata), and -- for now at least -- output functions (as
found in finite state transducers).  We are interested only in finite
machines with total transition functions, else we might call our
machines labelled transition systems.  Future versions of the library
might add some or all of these features, particularly if someone else
has an itch to scratch.  More generally, perhaps someone else will
find this useful, hence its release as a library.

The rest of this document is structured as follows.  Section
\ref{sec:definitions} introduces and defines the mathmatical
abstractions on which the library is based.  Section
\ref{sec:implementation} outlines the current implementation in
Haskell, though the full details are left to the Haddock
documentation.  Section \ref{sec:future_work} suggests future work.
Section \ref{sec:history} describes the history of the library.


\section{Definitions}
\label{sec:definitions}


\subsection{Finite state machines}
\label{sec:basic_definitions_fsm}
\label{sec:eliminating_implicit_self_loops}

For the purpose of this library, a \defn{finite state machine} (FSM)
is a tuple $(\states, \alphabet, \transition)$ where $\states$ is a
finite set (of states), $\alphabet$ is a finite set (of symbols)
called the system's \defn{alphabet}, and $\transition \; : \states
\times \alphabet \rightarrow \powerset{\states}$ is a total function,
called the machine's \defn{transition function} (we have chosen to
define $\transition$ as a total function to the powerset of states, as
this unifies the treatment of deterministic and nondeterministic
machines).

Given $\state \in \states$ and $\letter \in \alphabet$, we call the
set of states in $\transition\!(\state, \letter)$ the \defn{transition
  set} for $\letter$ at $\state$, written
$\transitionset{\state}{\letter}$.

We introduce an infix shorthand for single \defn{transitions}:
\begin{quote}
  $\forall \; \state, \stateAlt \in \states \suchthat \forall \;
  \letter \in \alphabet \suchthat \goesto{\state}{\letter}{\stateAlt}
  \iff \stateAlt \in \transitionset{\state}{\letter}$
\end{quote}
Here, $\state$ and $\stateAlt$ are called the \defn{source} and
\defn{destination} states of the transition, respectively, and
$\letter$ is called its \defn{label}.


\subsubsection*{Example}

%% Adapted from http://www.texample.net/tikz/examples/state-machine/
\begin{figure}
  \centering
  \subfloat[A finite state machine]{
    \begin{minipage}[b]{0.4\textwidth}
      \label{fig:deterministic_fsm_full}
      \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                          semithick]
        \tikzstyle{every state}=[fill=blue,draw=none,text=white]
        \node[state]         (A)                    {$A$};
        \node[state]         (B) [above right of=A] {$B$};
        \node[state]         (C) [below right of=B] {$C$};
        \node[state]         (D) [below right of=A] {$D$};
        \path (A) edge              node {x} (B)
                  edge              node {y} (C)
              (B) edge [loop above] node {x} (B)
                  edge              node {y} (C)
              (C) edge [loop right] node {y} (C)
                  edge              node {x} (D)
              (D) edge [loop below] node {x} (D)
                  edge              node {y} (A);
      \end{tikzpicture}
    \end{minipage}}
  \hspace{0.08\textwidth}
  \subfloat[With self-loops removed]{
    \begin{minipage}[b]{0.4\textwidth}
      \label{fig:deterministic_fsm_noselfloops}
      \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                          semithick]
        \tikzstyle{every state}=[fill=blue,draw=none,text=white]
        \node[state]         (A)                    {$A$};
        \node[state]         (B) [above right of=A] {$B$};
        \node[state]         (C) [below right of=B] {$C$};
        \node[state]         (D) [below right of=A] {$D$};
        \path (A) edge              node {x} (B)
                  edge              node {y} (C)
              (B) edge              node {y} (C)
              (C) edge              node {x} (D)
              (D) edge              node {y} (A);
      \end{tikzpicture}
    \end{minipage}}
  \caption{Two graphical representations of the same FSM}
  \label{fig:deterministic_fsm}
\end{figure}

Consider the FSM illustrated in figure
\ref{fig:deterministic_fsm}\subref{fig:deterministic_fsm_full},
i.e. $(\states, \alphabet, \transition)$ with:
\begin{eqnarray*}
  \states     & = & \{A, B, C, D \}\\
  \alphabet   & = & \{ x, y \} \\
  \transition & = & \{ \goesto{A}{x}{B},
                       \goesto{A}{y}{C},
                       \goesto{B}{x}{B},
                       \goesto{B}{y}{C},
                       \goesto{C}{x}{D},
                       \goesto{C}{y}{C},
                       \goesto{D}{x}{D},
                       \goesto{D}{y}{A}
                    \}
\end{eqnarray*}
Note abuse of notation in enumerating $\transition$ here; properly it
would written as
\begin{eqnarray*}
  \transition & = & \{ (A, x) \mapsto \{ B \}, (A, y) \mapsto \{ C \} , (B, x) \mapsto \{ B \}, (B, y) \mapsto \{ C \}, \cdots \}
\end{eqnarray*}
So, e.g. $\transitionset{A}{x} = \{ B \}$, $\transitionset{A}{y} = \{
C \}$, etc.

This FSM includes a number of \defn{self-loops},
e.g. $\goesto{B}{x}{B}$.  Because $\transition$ is total, self-loops
may in general be omitted from FSM representations without difficulty:
provided $\states$ and $\alphabet$ are known, the self-loops may be
easily inferred.  For example, figure
\ref{fig:deterministic_fsm}\subref{fig:deterministic_fsm_noselfloops}
illustrates the same FSM without self-loops, and we could write it out
as:
\begin{eqnarray*}
  \states     & = & \{A, B, C, D \}\\
  \alphabet   & = & \{ x, y \} \\
  \transition & = & \{ \goesto{A}{x}{B},
                       \goesto{A}{y}{C},
                       \goesto{B}{y}{C},
                       \goesto{C}{x}{D},
                       \goesto{D}{y}{A},
                    \}
\end{eqnarray*}


\subsection{Nondeterminism}
\label{sec:nondeterminism}
\label{sec:nondeterminism_and_self_loops}

An FSM $(\states, \alphabet, \transition)$ is \defn{nondeterministic}
(an NFSM) if the transition set for some word at some state has more
than one member.  Equivalently, if:

\begin{quote}
  $\exists~\state, \stateAlt, \stateAltAlt \in \states \wedge
  \exists~\letter \in \alphabet \suchthat
  \goesto{\state}{\letter}{\stateAlt} \wedge
  \goesto{\state}{\letter}{\stateAltAlt} \wedge \stateAlt \neq
  \stateAltAlt$
\end{quote}

Otherwise the machine is \defn{deterministic} (a DFSM).


\subsubsection*{Examples}

\begin{figure}
  \centering
  \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                      semithick]
    \tikzstyle{every state}=[fill=blue,draw=none,text=white]
    \node[state]         (E)                    {$E$};
    \node[state]         (F) [above right of=E] {$F$};
    \node[state]         (G) [below right of=F] {$G$};
    \node[state]         (H) [below right of=E] {$H$};
    \path (E) edge              node {k} (F)
              edge              node {k} (G)
              % edge [loop left]  node {j} (E)  % implicit self-loop
          (F) edge              node {j} (G)
              %edge [loop above] node {k} (F)   % implicit self-loop
          (G) edge              node {k} (H)
              % edge [loop right] node {j} (G)  % implicit self-loop
          (H) edge              node {j} (E)
              % edge [loop right] node {k} (H)  % implicit self-loop
              edge [loop below] node {j} (H);   % explicit self-loop
  \end{tikzpicture}
  \caption{A nondeterministic FSM}
  \label{fig:nondeterministic_fsm}
\end{figure}

Both machines in figure \ref{fig:deterministic_fsm} are deterministic.
Figure \ref{fig:nondeterministic_fsm} illustrates a nondeterministic
FSM where $\states = \{ E, F, G, H \}$ and $\alphabet = \{ j, k\}$
(with implicit self-loops omitted, e.g. $\goesto{E}{j}{E}$).  Here,
for example, we have $\transitionset{E}{k} = \{ F, G \}$, i.e. both
$\goesto{E}{k}{F}$ and $\goesto{E}{k}{G}$.

In \secref{sec:eliminating_implicit_self_loops} we stated that
self-loops may in general be omitted from FSM representations, and
inferred as required.  With NFSMs, some care must be taken.  Consider
the example in figure \ref{fig:nondeterministic_fsm}; as well as four
implicit self-loops, there is also an \emph{explicit} self-loop
$\goesto{H}{j}{H}$.  This \emph{cannot} be removed, because it is a
source of nondeterminism, along with $\goesto{H}{j}{E}$.


\subsection{Strings and destination sets}

We adopt the standard definition of \defn{strings} over an alphabet.
Specifically, if $\alphabet$ is an alphabet then $\strings$, the set
of strings over $\alphabet$ is the smallest set such that:

\begin{enumerate}

\item $\strings$ contains the empty string: $\emptystring \in
  \strings$

  \item $\word \in \strings \wedge \letter \in \alphabet \implies
    \word\letter \in \strings$

\end{enumerate}

Then the transition function $\transition \; : (\states \times
\alphabet) \rightarrow \powerset{\states}$ may be lifted to the
\defn{string transition function} $\stringtransition \; : (\states
\times \strings) \rightarrow \powerset{\states}$ as follows:

\begin{enumerate}

  \item $\stringtransition\!(\state, \emptystring) = \{ \state \}$

  \item $\stringtransition\!(\state, \word\letter) = \bigcup \; \{
    \transitionset{\stateAlt}{\letter} \comprehend \stateAlt \in
    \destinationset{\state}{\word} \}$

\end{enumerate}

Given $\state \in \states$ and $\word \in \strings$, we call the set
of states in $\stringtransition\!(\state, \word)$ the
\defn{destination set} for $\word$ at $\state$, written
$\destinationset{\state}{\word}$.

As with single symbols, we introduce an infix shorthand for single
\defn{string transitions}:
\begin{quote}
  $\forall \; \state, \stateAlt \in \states \suchthat \forall \;
  \word \in \strings \suchthat \stringgoesto{\state}{\word}{\stateAlt}
  \iff \stateAlt \in \destinationset{\state}{\word}$
\end{quote}

Given $\state \in \states$, two strings are said to be
\defn{destination equivalent} at $\state$, written
$\destinationequiv{\state}$, if their destination sets at $\state$ are
identical:

\begin{quote}
  $\forall \; \state \in \states \suchthat \forall \; \word, \wordAlt
  \in \strings \suchthat \word \destinationequiv{\state} \wordAlt ~
  \Longleftrightarrow ~ \destinationset{\state}{\word} \; = \;
  \destinationset{\state}{\wordAlt}$
\end{quote}


\subsubsection*{Examples}

Consider the DFSM in figure \ref{fig:deterministic_fsm}; here we have,
for example $\stringgoesto{A}{xyxxx}{D}$ and
$\stringgoesto{C}{xyxxx}{B}$.

Consider the NFSM in figure \ref{fig:nondeterministic_fsm}; here we
have (for example) $\destinationset{E}{kkjjk} = \{ H, F \}$, so that
both $\stringgoesto{E}{kkjjk}{H}$ and $\stringgoesto{E}{kkjjk}{F}$.


\subsection{Actions and action equivalence}
\label{sec:actions}

Given an FSM $(\states, \alphabet, \transition)$, for every $\word \in
\strings$, $w$'s \defn{action}, written $\action{\word} \; \subseteq
\states \times \powerset{\states}$ captures its effect across every
state in $\states$:

\begin{quote}
  $\action{\word} = \{ (\state, \destinationset{\state}{\word})
  \comprehend \state \in \states \}$
\end{quote}

Thus, an action captures the effect of a string of transitions, across
all states in the machine.

An action is \defn{deterministic} if, for every one of its pairs, the
destination set found in the pair's second element has only a single
member; otherwise it is \defn{nondeterministic}.  We state without
proof that an FSM is deterministic if and only if all of its actions
are deterministic.

Two strings are said to be \defn{action equivalent}, written
$\actionequiv$, if their actions are identical:

\begin{quote}
  $\forall \word, \wordAlt \in \strings \suchthat \word \actionequiv
  \wordAlt ~ \Longleftrightarrow ~ \action{\word} \; = \;
  \action{\wordAlt}$
\end{quote}


\subsubsection{Examples}

Consider again the FSM in figure \ref{fig:deterministic_fsm}.  Here
are some actions over this machine:
\begin{eqnarray*}
  \action{x}   & = & \{ (A,\{B\}), (B,\{B\}), (C,\{D\}), (D,\{D\}) \} \\
  \action{y}   & = & \{ (A,\{C\}), (B,\{C\}), (C,\{C\}), (D,\{A\}) \} \\
  \action{xx}  & = & \{ (A,\{B\}), (B,\{B\}), (C,\{D\}), (D,\{D\}) \} \\
  \action{xy}  & = & \{ (A,\{C\}), (B,\{C\}), (C,\{A\}), (D,\{A\}) \} \\
  \action{xxx} & = & \{ (A,\{B\}), (B,\{B\}), (C,\{D\}), (D,\{D\}) \} \\
  \action{xyy} & = & \{ (A,\{C\}), (B,\{C\}), (C,\{C\}), (D,\{C\}) \}
\end{eqnarray*}
Notice that $xx \actionequiv xxx$, for example, and that all of these
actions are determinstic -- as we would expect as the FSM is
determinstic too.

Conversely, consider the nondeterministic FSM in figure
\ref{fig:nondeterministic_fsm}.  Here we have some nondeterminstic
actions, e.g.
\begin{eqnarray*}
  \action{j}   & = & \{ (E, \{ E \}),    (F, \{ G \}),    (G, \{ G \}),    (H, \{ E, H\}) \} \\
  \action{jk}  & = & \{ (E, \{ F, G \}), (F, \{ H \}),    (G, \{ H \}),    (H, \{ F, G, H \}) \} \\
  \action{jkj} & = & \{ (E, \{ G \}),    (F, \{ E, H \}), (G, \{ E, H \}), (H, \{ E, G, H \}) \}
\end{eqnarray*}


\section{Implementation}
\label{sec:implementation}


\emph{Please note that this section is falling somewhat behind
  reality; however, the library's Haddock documentation is up to date
  and fairly complete.}


\subsection{\texttt{Data.FsmActions}}


\subsubsection{States, destination sets, actions, and words}

The current Haskell implementation is focused strongly on actions, and
makes some simplifying assumptions about FSMs.  In particular, the
members of the set $\states$ of states are integers, counting upwards
from 0 (so e.g. in an FSM with 5 states, $S = \{ 0, 1, 2, 3, 4 \}$.
In the current implementation, $\states$ is not represented
explicitly, but rather implicitly, as a property of the actions in the
FSM (see below).

\begin{haskell}
type State = Int
\end{haskell}
A destination set then consists of a set of integers, where all
members of the set must be smaller than the number of states in the
FSM.  In the current implementation, we represent such sets as plain
Haskell integer lists:

\begin{haskell}
newtype DestinationSet = DestinationSet {
      destinations :: [State]
    } deriving (Eq, Ord, Show)
\end{haskell}

Then an action is just a list of destination sets; the index of each
destination set within the list is the set's corresponding source
state; for example, the first entry in the list contains the
destination set for state 0.  Contrast this with the definition in
\secref{sec:actions}, where source states are explicitly paired with
their destination sets.

\begin{haskell}
newtype Action = Action {
      destinationSets :: [DestinationSet]
    } deriving (Eq, Ord, Show)
\end{haskell}

Actions may be constructed either directly using the \texttt{Action}
constructor, or via convenience functions obviating the need to
decorate destination lists with the \texttt{DestinationSet}
constructor.

\begin{haskell}
mkAction :: [[State]] -> Action
mkDAction :: [State] -> Action
\end{haskell}

Finally, we introduce a type, \texttt{Word}, for sequences of symbols.
\begin{haskell}
newtype Word sy = Word [sy]
\end{haskell}


\subsubsection{Finite state machines}

A finite state machine is then a mapping from alphabet symbols to
actions; each action tells us the effect, over the whole machine, of a
particular singleton symbol.  Note that this is parametric over the
symbol type --- typically we expect to use \texttt{Char}s or
\texttt{String}s.

\begin{haskell}
newtype FSM sy = FSM {
      unFSM :: M.Map sy Action
    } deriving (Eq, Ord, Show)
\end{haskell}

It's then easy to compute the machine's set of states (from the length
of the first action found in the map, assuming they're all the same
length -- see \secref{sec:wellformedness}) and its alphabet (just the
keys of the map); but note that both of these results are lists, not
sets.

\begin{haskell}
states :: FSM sy -> [State]
alphabet :: FSM sy -> [sy]
\end{haskell}

For convenience, we expose the \texttt{lookup} function of an
\texttt{FSM}'s underlying \texttt{Map}.

\begin{haskell}
fsmAction :: Ord sy => sy -> FSM sy -> Maybe Action
\end{haskell}


\subsubsection{Normalisation}

An FSM which may be \emph{normalised}, which ensures that all of its
actions' destination sets are non-empty (empty ones become
self-loops), are sorted and free from duplicates.  Like
well-formedness checks, this can be useful for cleaning data after I/O
or conversion from other formats.

\begin{haskell}
normalise :: FSM sy -> FSM sy
\end{haskell}


\subsubsection{Computing actions/equivalences for strings}

Given the actions for two strings $\word$ and $\wordAlt$,
\texttt{append} computes the action for the string $\word\wordAlt$
formed by concatenating the strings.  We map over every source state
in the first action, and for every state in each destination set, we
(\texttt{appendAtState}) look up the set of destination states reached
from that state via the second action (\texttt{actionLookup}),
collecting the results for each source state by flattening the list
produced, sorting it, and removing duplicates.

\begin{haskell}
append :: Action -> Action -> Action
appendAtState :: DestinationSet -> Action -> DestinationSet
actionLookup :: Action -> State -> DestinationSet
\end{haskell}

While this works nicely as an easily comprehensible canonical
implementation of \texttt{append} --- and is handy for testing
purposes --- we suspect a more efficient algorithm, probably based on
a better representation than Haskell lists, will be found for future
versions of the library.

It's then quite straightforward to compute the action of a given word
(by folding \texttt{append} over the actions for the word's
constituent symbols), and to test action equivalences on words.  Note
that since a word may include symbols not in the machine's alphabet,
\texttt{action}'s return type (and much of the computation)
necessarily has type \texttt{Maybe Action}.

\begin{haskell}
action :: Ord sy => FSM sy -> Word sy -> Maybe Action
actionEquiv :: Ord sy => FSM sy -> Word sy -> Word sy -> Bool
\end{haskell}


\subsubsection{Computing destination sets/equivalences for strings}

Given a method to compute actions, computation of destination sets and
destination equivalences is then straightforward (with the same
comment regarding the use of \texttt{Maybe}).

\begin{haskell}
destinationSet :: Ord sy => FSM sy -> State -> Word sy -> Maybe DestinationSet
destinationEquiv :: Ord sy => FSM sy -> State -> Word sy -> Word sy -> Bool
\end{haskell}


\subsubsection{The identity action}

An FSM's \defn{identity action} is the action which maps all states
back onto just themselves.  Because this depends on the size of the
FSM, there is no general identity action for all FSMs --- which is the
primary reason \texttt{FSM} is not an instance of
\texttt{Data.Monoid}.

\begin{haskell}
fsmIdentity :: FSM sy -> Action
identity :: Int -> Action
\end{haskell}


\subsubsection{Determinism checks}

We can test if an action or FSM is determinstic.

\begin{haskell}
isDAction :: Action -> Bool
isDFSM :: FSM sy -> Bool
\end{haskell}


\subsection{\texttt{Data.FsmActions.WellFormed}}
\label{sec:wellformedness}

In our implementation, an FSM is well-formed provided:

\begin{itemize}

  \item Every action is the same length (the number of states in the
    machine).

  \item None of those destination sets contain out-of-range
    destination state values (i.e. negative or too high).

  \item It is (at least) \emph{weakly connected} (see
    \secref{sec:weakly_connected_components}); otherwise, it is
    \emph{disconnected}.  (Note that an FSM with no nodes is
    considered to be disconnected.)

\end{itemize}

An FSM should be checked for well-formedness after construction (which
typically involves some I/O) using the \texttt{isWellFormed} function,
which yields a \texttt{WellFormed} value encoding any problems found.

\begin{haskell}
data WellFormed sy = BadLengths [(sy, Int)]
                   | BadActions [(sy, Action)]
                   | Disconnected [[State]]
                   | WellFormed
                   deriving (Eq, Show)
isWellFormed :: Ord sy => FSM sy -> WellFormed sy
\end{haskell}

If all actions do not have the same length, the \texttt{BadLengths}
constructor contains a full list of (symbol, action length) pairs so
discrepancies may be identified.  Similarly, if some actions contain
out-of-range destinations, the \texttt{BadActions} constructor lists
the offending actions and their corresponding symbols.  If the FSM is
not at least weakly-connected, the \texttt{Disconnected} constructor
lists its weakly-connected components.  Finally, if all is well, the
\texttt{WellFormed} constructor is used.


\subsection{\texttt{Data.FsmActions.FsmMatrix}}

Matrix-based serialisation/deserialisation of FSMs.  Here an FSM is
represented as a single matrix, with one (newline-separated) row per
state, and one (whitespace-separated) column per symbol.  The first
row contains symbol names.  Subsequent rows contain destination sets
as comma-separated lists of states.  State/row numbering starts at 0.

Note that this representation currently suffers from the restriction
that symbol names may not contain whitespace. TODO: check, and then
consider fixing this (or at least providing a workaround,
e.g. allowing double-quoted symbol names).

TODO: Say more here, and give examples.


\subsection{\texttt{Data.FsmActions.ActionMatrix}}

Actionmatrix-based serialisation/deserialisation of FSMs.  Here an FSM
is represented as a collection of binary adjacency matrices, one per
action.

TODO: Say more here, and give examples.


\subsection{\texttt{Data.FsmActions.Graph}}
\label{sec:weakly_connected_components}

\texttt{fsmToFGL} converts an \texttt{FSM} into an
\texttt{fgl}\footnote{Functional Graph Library:
  \texttt{http://hackage.haskell.org/package/fgl}} graph with labelled
edges and unlabelled nodes.  The \texttt{SelfLoops} data type encodes
whether to keep all self-loops, or trim all those which are not
sources of nondeterminism --- see
\secref{sec:nondeterminism_and_self_loops}.

\begin{haskell}
data SelfLoops = Keep | Trim deriving Eq
fsmToFGL :: FSM sy -> SelfLoops -> Gr () sy
\end{haskell}

\texttt{strongCCs} and \texttt{weakCCs} compute an \texttt{FSM}'s
strongly-connected and weakly-connected components respectively, where
the WCCs are the SCCs of an \emph{undirected} graph of the machine.

\begin{haskell}
strongCCs :: Eq sy => FSM sy -> [[State]]
weakCCs :: Eq sy => FSM sy -> [[State]]
\end{haskell}

\begin{figure}
  \centering
  \subfloat[\texttt{strong}]{
    \begin{minipage}[b]{0.25\textwidth}
      \label{fig:strongly_connected_fsm}
      \centering
      \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                          semithick]
        \tikzstyle{every state}=[fill=blue,draw=none,text=white]
        \node[state] (A)              {$0$};
        \node[state] (B) [below of=A] {$1$};
        \node[state] (C) [right of=A] {$2$};
        \node[state] (D) [below of=C] {$3$};
        \path (A) edge [bend right] node {x} (B)
                  edge [bend right] node {y} (C)
              (B) edge [bend right] node {x} (A)
              (C) edge [bend right] node {x} (D)
                  edge [bend right] node {y} (A)
              (D) edge [bend right] node {x} (C);
      \end{tikzpicture}
    \end{minipage}}
  \hspace{0.08\textwidth}
  \subfloat[\texttt{weak}]{
    \begin{minipage}[b]{0.25\textwidth}
      \label{fig:weakly_connected_fsm}
      \centering
      \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                          semithick]
        \tikzstyle{every state}=[fill=blue,draw=none,text=white]
        \node[state] (A)                       {$0$};
        \node[state] (B) [below of=A]          {$1$};
        \node[state] (C) [fill=red,right of=A] {$2$};
        \node[state] (D) [fill=red,below of=C] {$3$};
        \path (A) edge [bend right] node {x} (B)
                  edge              node {y} (C)
              (B) edge [bend right] node {x} (A)
              (C) edge [bend right] node {x} (D)
              (D) edge [bend right] node {x} (C);
      \end{tikzpicture}
    \end{minipage}}
  \hspace{0.08\textwidth}
  \subfloat[\texttt{disc}]{
    \begin{minipage}[b]{0.25\textwidth}
      \label{fig:disconnected_fsm}
      \centering
      \begin{tikzpicture}[->,>=stealth',shorten >=1pt,auto,node distance=2.8cm,
                          semithick]
        \tikzstyle{every state}=[fill=blue,draw=none,text=white]
        \node[state] (A)                       {$0$};
        \node[state] (B) [below of=A]          {$1$};
        \node[state] (C) [fill=red,right of=A] {$2$};
        \node[state] (D) [fill=red,below of=C] {$3$};
        \path (A) edge [bend right] node {x} (B)
              (B) edge [bend right] node {x} (A)
              (C) edge [bend right] node {x} (D)
              (D) edge [bend right] node {x} (C);
      \end{tikzpicture}
    \end{minipage}}
  \caption{Strongly connected, weakly connected, and disconnected FSMs}
  \label{fig:sccs}
\end{figure}

Consider figure \ref{fig:sccs}.  \texttt{strong} is
strongly-connected; \texttt{weak} is weakly connected but not strongly
connected; \texttt{disc} is not even weakly-connected (it is
\emph{disconnected}):

\begin{math}
\begin{array}{rcccccl}
  \textrm{strongCCs}(\textrm{strong}) &=& \textrm{weakCCs}(\textrm{strong}) &=& \textrm{weakCCs}(\textrm{weak}) &=& \{\{0,1,2,3\}\} \\
  \textrm{strongCCs}(\textrm{weak})   &=& \textrm{strongCCs}(\textrm{disc}) &=& \textrm{weakCCs}(\textrm{disc}) &=& \{\{0,1\},\{2,3\}\}
\end{array}
\end{math}


For graph output, \texttt{fsmToDot} converts an \texttt{FSM} to a
\texttt{DotGraph}\footnote{From the \texttt{graphviz} library:
  \texttt{http://hackage.haskell.org/package/graphviz}} with labelled
edges and integer-labelled nodes, and all self-loops removed except
sources of nondeterminism (see \secref{sec:wellformedness}).  It's
fairly trivial; more sophisticated graph-drawing could be achieved by
rolling your own based on this code.  Similarly, \texttt{fsmToGML}
converts an \texttt{FSM} to a GML\footnote{Graph Modelling Language:
  \texttt{http://en.wikipedia.org/wiki/Graph\_Modelling\_Language}}
format graph.

\begin{haskell}
fsmToDot :: (Ord sy, CleanShow sy) => FSM sy -> DotGraph
fsmToGML :: CleanShow sy => FSM sy -> Doc
\end{haskell}

Note that both of these functions require the symbol type to be an
instance of the \texttt{CleanShow} typeclass.  Defined in this same
module, this subclasses the standard \texttt{Show} typeclass,
providing a \texttt{cleanShow} function which acts like \texttt{show}
except that it special-cases \texttt{String} and \texttt{Char} so they
are not quoted.  For example, while \texttt{show "s"} is \texttt{"s"},
\texttt{cleanShow "s"} is just \texttt{s}, which is more like what you
want for graph labels.  Anyway, labels of any type which is an
instance of \texttt{Show} should `just work' with the default
implementation.


\section{Future work}
\label{sec:future_work}


\subsection{More examples}

Add more examples of the various representations.


\subsection{Quickcheck tests}

We currently have some HUnit-based unit tests, but there are some
properties which could usefully be QuickChecked, for example:

\begin{quote}
  $\forall \word, \wordAlt, \wordAltAlt \in \strings \suchthat w = xy
    \implies \action{w} = append(\action{x}, \action{y})$
\end{quote}


\subsection{Faster representations}

Plain Haskell lists are surely not the fastest representation we could
use; in particular, appending actions is based on many lookups of
destination sets within actions, via \texttt{(!!)}, which is $O(n)$.
A tree-based representation should give us $O(\log n)$.  An unboxed
representation (since, ultimately, we're just looking up integers)
would be even better --- but as destination sets vary in size, it's
not currently clear how to implement that.  Memoization may also have
a role to play.


\subsection{Interfaces to other packages}
\label{sec:future_work_interfaces}

\texttt{halex}?


\section{History}
\label{sec:history}


\subsection{version 0.1 -- June 30, 2009}

First version of the library, Cabal-ised and released to Hackage.
Naive implementation based on plain Haskell lists.  Support for
\texttt{ActionMatrix} and \texttt{FsmMatrix} serialisation formats.
HUnit for unit tests.


\subsection{version 0.2.0 -- July 18, 2009}

Cleaned up \texttt{WellFormed}, and broke it out into its own module.
Added \texttt{ActionMatrix} output and some test cases.  Added
\texttt{fgl} interface, including computation of
strongly/weakly-connected components.  Added \texttt{graphviz}
interface.  Conform to Haskell package versioning
policy\footnote{\texttt{http://www.haskell.org/haskellwiki/Package\_versioning\_policy}}.


\subsection{version 0.3.0 -- July 20, 2009}

Added GML (Graph Modelling Language) output.  Unified FGL, GraphViz
and GML modules into \texttt{Data.FsmActions.Graph}.  Fixed for latest
\texttt{graphviz} library.


\subsection{version 0.4.0 -- October 21, 2009}

Added FGL (Functional Graph Library) input.  Added \texttt{FsmEdges}
input/output (graph format produced by Mathematica).  Added
`ActionSpecs' to \texttt{FsmActions} I/O formats, so that individual
\texttt{ActionMatrix} files don't have to be explicitly identified.
Added intelligent/unified load/save (in \texttt{Data.FsmActions.IO}).
Fixed for latest \texttt{graphviz} library.  Improved error handling.
Section \secref{sec:implementation} of this documentation is now
falling behind, however.

\subsection{version 0.4.3 -- July 20, 2011}

Alterations tracking updates to the \texttt{graphviz} library.


\end{document}

\endinput